[syslinux] gpxelinux.0 and slow HTTP performance on VMware ESX VM

Gene Cumm gene.cumm at gmail.com
Wed Jun 29 14:57:00 PDT 2011


On Wed, Jun 29, 2011 at 14:57, Schlomo Schapiro
<syslinux at schlomo.schapiro.org> wrote:
> Hi,
>
> first of all I would like to voice my deep gratitude to all syslinux developers
> for this really important software. I am using it in all my automation projects
> and could not manage without it.
>
> Unfortunately I have now stumbled upon a problem where I am at my wits' end and
> need some help.
>
> The core problem is that HTTP transfers by gpxelinux.0 are very slow. Sadly this
> problem seems to be somehow related to our VMware ESX environment and I am not
> able to pin it down.
>
> Please bear with me while I try to give you a picture of what we are doing.
>
> We have a VMware ESX VM that serves as boot and installation server, hosting
> DHCP, DNS, TFTP and HTTP services for 2 networks (called "d" and "a"). The "d"
> network consists mostly of desktop computers (mostly Dell T5500 workstations)
> while the "a" network consists exclusively of VMware ESX 4.1 VMs.

Which network is the boot/install server attached to, "d", "a" or
another?  If "a", do you see an issue when the booting VM is on the
same host as the boot/install server VM?

> The boot configuration is shared between the networks and looks like this:
>
> DHCP filename: gpxelinux.0 (from syslinux 4.04)
> pxelinux prefix: http://server.domain/boot
>
> http://server.domain/boot/pxelinux.cfg/default loads a vesamenu.c32 based menu
> structure that allows various installs of Ubuntu, RHEL and CentOS, all
> accessible exclusively via HTTP (e.g. kernel
> http://server.domain/centos/5/x86_64/.../vmlinuz)
>
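
For reference, a setup like that usually boils down to handing out gpxelinux.0
over TFTP and pushing the HTTP prefix via DHCP option 210.  A rough ISC dhcpd
sketch (placeholder addresses, not your actual config):

    option path-prefix code 210 = text;     # PXELINUX "path prefix" option

    subnet 192.0.2.0 netmask 255.255.255.0 {
        range 192.0.2.100 192.0.2.200;
        next-server 192.0.2.1;              # TFTP server holding gpxelinux.0
        filename "gpxelinux.0";             # from syslinux 4.04
        option path-prefix "http://server.domain/boot/";
    }
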
> Booting a desktop system on the "d" network goes really fast, and loading the
> 35MB initrd of the RHEL6.1 installer takes 1-2 seconds.
>
> Booting a VM on the "a" network is much slower, loading the same 35MB initrd
> from the same URL takes >20 seconds. Also, Wireshark on the boot server shows
> lots of TCP retransmissions and duplicate ACK packets, and about 10-20% of all
> boot attempts on the "a" network fail by either getting stuck in vesamenu.c32 or
> by reporting an error, aborting the boot and rebooting after the pxelinux reboot
> timeout.
>
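
Just to put that gap in perspective: 35 MB in 1-2 seconds is roughly 140-280
Mbit/s, while 35 MB in 20+ seconds is below about 15 Mbit/s, so the "a" path is
losing an order of magnitude of effective throughput somewhere.
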
> Some other things we also noticed:
> * TFTP on the "a" network is much faster than HTTP, but a small fraction of
> boot attempts still fail.
> * some TFTP requests seem to come twice, e.g. I see two log entries for
> gpxelinux.0, but only one for vesamenu.c32.
> * we were not able to find any difference in the network configuration.
> * The VMware VM network card type seems to have no effect on the HTTP transfer
> times; at least e1000 and vmxnet3 behave similarly.

You should also be able to test with an older NIC type, though it may still
have issues with PXELINUX 4.10-pre15.  Configure a VM as something like "Other
2.6 Linux" and it should act like a PCnet32 for PXE.

> * Using gPXE and vesamenu.c32 directly fails with vesamenu.c32 >3.85 (or so) and
> simply reboots the system.

Using vesamenu.c32 from Syslinux 4.xx will do this; the issue is that
gPXE doesn't recognize the fact that it's a COM32R rather than a 3.xx
COM32.
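
If you ever need to check which flavor a given .c32 is, the two formats differ
in the 5-byte magic at the start of the file, so something like this (GNU od)
shows it at a glance if you compare a 3.xx and a 4.xx copy of the same module:

    od -An -tx1 -N5 vesamenu.c32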

> The questions we have are the following:
>
> * Is there any known issue with gpxelinux.0 and HTTP transfers on VMware ESX (4.1)?

I'm pretty sure I've tried this personally and not had issues; I know I
haven't had any with those two NICs.

> * How could we debug gpxelinux.0 and HTTP transfers, specifically find out why
> there are so many TCP retransmits?
>
> * Is it possible to tune the HTTP and TFTP protocol engines, e.g. timeouts,
> retries etc.?
>
> * Can you give us any advice on how to troubleshoot such a problem?

I've already got a similar system, so I'll try some tests myself.
Non-bug guesses include a vSwitch or physical switch with issues or with
bandwidth throttling active, a bad NIC or switch port, or an overloaded
host/network.
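
To put numbers on the retransmissions, one thing you could try (interface name
and client address below are placeholders) is to capture one slow boot on the
boot/install server and let tshark count the bad segments, then repeat for a
fast "d" boot and compare:

    tcpdump -ni eth0 -w pxeboot.pcap 'host 192.0.2.50 and (port 80 or port 69)'
    tshark -r pxeboot.pcap -q -z io,stat,1,tcp.analysis.retransmission,tcp.analysis.duplicate_ack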

> We'll be happy to try out anything that could help; this boot issue is basically
> the last problem in setting up a fairly large self-service virtualization
> environment for our developers. The reason we need to use HTTP booting is that
> we provide the pxelinux configuration dynamically and use this as a quality gate
> that new VMs have to pass before they are allowed to boot. This is the core of
> our self-service virtualization which is published as "Lab Manager Light" on
> http://blog.schlomo.schapiro.org/2011/05/lab-manager-light-self-service.html.

Sounds like an interesting project.
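
For anyone else reading along and wondering how a dynamically generated
pxelinux configuration can act as a quality gate: PXELINUX asks the HTTP prefix
for a series of names under pxelinux.cfg/ (client UUID, 01-<MAC>, hex-IP
prefixes, finally "default"), so the server can decide per client what to hand
back.  A toy sketch of that idea (this is not Lab Manager Light itself; every
MAC, URL and port below is invented):

    #!/usr/bin/env python3
    # Toy illustration: answer PXELINUX config requests differently per client.
    from http.server import BaseHTTPRequestHandler, HTTPServer

    APPROVED = {"01-00-50-56-aa-bb-cc"}        # clients that passed the gate

    INSTALL = (b"DEFAULT install\n"
               b"LABEL install\n"
               b"  KERNEL http://server.domain/boot/vmlinuz\n"
               b"  APPEND initrd=http://server.domain/boot/initrd.img\n")
    DENY = b"DEFAULT local\nLABEL local\n  LOCALBOOT 0\n"

    class CfgHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # last path component is "01-<mac>", a hex-IP prefix, or "default"
            name = self.path.rsplit("/", 1)[-1]
            body = INSTALL if name in APPROVED else DENY
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    HTTPServer(("", 8080), CfgHandler).serve_forever()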

-- 
-Gene



