[syslinux] syslinux.efi problem TFTPing ldlinux.e64 on hyper-v gen2 netboot

John Kennedy jkennedy314159 at outlook.com
Mon Apr 13 12:38:27 PDT 2015


I've found a problem that I think is in syslinux, specifically when it tries to load ldlinux.e64 under Hyper-V.

I have a physical computer with a Gigabyte GA-P85-D3 motherboard that successfully PXE-boots using the EFI images, and I hold that up as exhibit 1 that my deployment environment works.  It also shows that the syslinux.efi image works in some cases, but not in all.

My Hyper-V (manager) version identifies itself as 6.3.9600.16384 running on Windows 8.1.
The VM is "Version 5.0, Generation 2" (via VM summary), using the only available network adapter.
Secure Boot is disabled (if it were enabled, I would probably have been burned earlier, when loading the initial syslinux.efi).

The syslinux.efi and ldlinux.e64 are coming from syslinux-6.03.tar.gz (from www.kernel.org).

Here is a boot sequence for the physical system that works:

        Apr  8 06:44:01 localhost dhcpd: DHCPDISCOVER from 74:d4:35:b3:45:f1 via eth0
        Apr  8 06:44:01 localhost dhcpd: DHCPOFFER on 10.0.1.70 to 74:d4:35:b3:45:f1 via eth0
        Apr  8 06:44:05 localhost dhcpd: DHCPREQUEST for 10.0.1.70 (10.0.1.241) from 74:d4:35:b3:45:f1 via eth0
        Apr  8 06:44:05 localhost dhcpd: DHCPACK on 10.0.1.70 to 74:d4:35:b3:45:f1 via eth0
        Apr  8 06:44:05 localhost in.tftpd[4696]: RRQ from 10.0.1.70 filename syslinux.efi
        Apr  8 06:44:05 localhost in.tftpd[4696]: tftp: client does not accept options
        Apr  8 06:44:05 localhost in.tftpd[4697]: RRQ from 10.0.1.70 filename syslinux.efi
        Apr  8 06:44:06 localhost in.tftpd[4698]: RRQ from 10.0.1.70 filename ldlinux.e64
        Apr  8 06:44:06 localhost in.tftpd[4699]: RRQ from 10.0.1.70 filename pxelinux.cfg/7402d403-3504-b305-4506-f10700080009
        Apr  8 06:44:06 localhost in.tftpd[4700]: RRQ from 10.0.1.70 filename pxelinux.cfg/01-74-d4-35-b3-45-f1
        Apr  8 06:44:07 localhost in.tftpd[4701]: RRQ from 10.0.1.70 filename pxelinux.cfg/0A000146
        Apr  8 06:44:07 localhost in.tftpd[4702]: RRQ from 10.0.1.70 filename pxelinux.cfg/0A00014
        Apr  8 06:44:07 localhost in.tftpd[4703]: RRQ from 10.0.1.70 filename pxelinux.cfg/0A0001
        Apr  8 06:44:07 localhost in.tftpd[4704]: RRQ from 10.0.1.70 filename pxelinux.cfg/0A000
        Apr  8 06:44:07 localhost in.tftpd[4705]: RRQ from 10.0.1.70 filename pxelinux.cfg/0A00
        Apr  8 06:44:07 localhost in.tftpd[4706]: RRQ from 10.0.1.70 filename pxelinux.cfg/0A0
        Apr  8 06:44:07 localhost in.tftpd[4707]: RRQ from 10.0.1.70 filename pxelinux.cfg/0A
        Apr  8 06:44:07 localhost in.tftpd[4708]: RRQ from 10.0.1.70 filename pxelinux.cfg/0
        Apr  8 06:44:07 localhost in.tftpd[4709]: RRQ from 10.0.1.70 filename pxelinux.cfg/default
        Apr  8 06:44:08 localhost in.tftpd[4710]: RRQ from 10.0.1.70 filename vmlinuz
        Apr  8 06:44:21 localhost in.tftpd[4711]: RRQ from 10.0.1.70 filename initrd.img
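
For reference, the order of the pxelinux.cfg requests in the working log can be reproduced with a small sketch.  This is only my reading of the log above (UUID, then "01-" plus the MAC, then the IP in uppercase hex shortened one digit at a time, then default), not code taken from syslinux; the function name is made up for illustration.

```python
import ipaddress


def pxelinux_config_candidates(client_ip, mac, uuid=None):
    """Return config paths in the order the working log shows them requested.

    Hypothetical helper: the search order is inferred from the TFTP log,
    not copied from the syslinux sources.
    """
    paths = []
    if uuid:
        paths.append("pxelinux.cfg/" + uuid.lower())
    # "01" is the ARP hardware type for Ethernet, followed by the MAC with dashes.
    paths.append("pxelinux.cfg/01-" + mac.lower().replace(":", "-"))
    # The IPv4 address as 8 uppercase hex digits, then one digit shorter each time.
    hex_ip = "%08X" % int(ipaddress.IPv4Address(client_ip))
    for length in range(len(hex_ip), 0, -1):
        paths.append("pxelinux.cfg/" + hex_ip[:length])
    paths.append("pxelinux.cfg/default")
    return paths


if __name__ == "__main__":
    for path in pxelinux_config_candidates(
        "10.0.1.70", "74:d4:35:b3:45:f1",
        "7402d403-3504-b305-4506-f10700080009",
    ):
        print(path)
```

Run against the physical system's values, this lists the same eleven names as the log, from the UUID file down to pxelinux.cfg/default.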

The failing system only gets as far as *starting* to download ldlinux.e64:

        Apr  8 08:18:26 localhost dhcpd: DHCPDISCOVER from 00:15:5d:01:d5:05 via eth1
        Apr  8 08:18:26 localhost dhcpd: DHCPOFFER on 10.10.0.71 to 00:15:5d:01:d5:05 via eth1
        Apr  8 08:18:30 localhost dhcpd: DHCPREQUEST for 10.10.0.71 (10.10.0.1) from 00:15:5d:01:d5:05 via eth1
        Apr  8 08:18:30 localhost dhcpd: DHCPACK on 10.10.0.71 to 00:15:5d:01:d5:05 via eth1
        Apr  8 08:18:30 localhost in.tftpd[3662]: RRQ from 10.10.0.71 filename syslinux.efi
        Apr  8 08:18:30 localhost in.tftpd[3662]: tftp: client does not accept options
        Apr  8 08:18:30 localhost in.tftpd[3663]: RRQ from 10.10.0.71 filename syslinux.efi
        Apr  8 08:18:30 localhost in.tftpd[3664]: RRQ from 10.10.0.71 filename ldlinux.e64
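
To compare the two runs quickly, a rough sketch like the following pulls the RRQ entries out of syslog and reports the last file each client asked for.  The regular expression assumes the in.tftpd line format shown in these excerpts, and the function names are hypothetical.  Keep in mind that an RRQ entry only shows that the read request arrived, not that the transfer finished.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   ... in.tftpd[3664]: RRQ from 10.10.0.71 filename ldlinux.e64
RRQ_RE = re.compile(r"in\.tftpd\[(\d+)\]: RRQ from (\S+) filename (\S+)")


def requests_by_client(log_lines):
    """Map each client IP to the filenames it requested, in log order."""
    seen = defaultdict(list)
    for line in log_lines:
        match = RRQ_RE.search(line)
        if match:
            _pid, client, filename = match.groups()
            seen[client].append(filename)
    return dict(seen)


def last_request(log_lines):
    """Return the last requested filename per client IP."""
    return {ip: names[-1] for ip, names in requests_by_client(log_lines).items()}


if __name__ == "__main__":
    import sys
    # Usage: python last_rrq.py < /var/log/messages   (log path is an assumption)
    for ip, name in sorted(last_request(sys.stdin).items()):
        print(ip, name)
```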

I say *starting* because it only sends the initial TFTP packet (which is enough to create the log entry) and then times out.  I've captured packets from the TFTP server's perspective, but haven't been able to packet-sniff the VM side (I haven't found any port-mirroring ability).  I also haven't figured out how to rebuild syslinux from the Git sources to see what is going on from syslinux's perspective, or at least to crank up the verbosity.

The bad ldlinux.e64 TFTP attempt sends the first packet (the TFTP read request) to the TFTP server, and the TFTP server answers with the option acknowledgement, retransmitting it periodically.  The TFTP server never gets a follow-up request for any blocks.  Since the syslinux code is the same as on the physical system and the basic information provided to it is correct, I can only assume that something in the Hyper-V environment is causing syslinux a problem.

I tried to do some printf-quality hacking on the Git sources.  syslinux.efi seems to have all the right information (TFTP server, IP, subnet mask), which is confirmed by the first packet reaching the server.  To me, it looked like it was hanging in the fread() while loading the module, but I don't trust my compilation (the physical system wouldn't boot using the hacked-up version either).

Suggestions on how to crack this problem open some more?


