[syslinux] PXELINUX: how to debug initrd corruption?

Andy Walls awalls at md.metrocast.net
Sun Jan 8 14:49:25 PST 2012


On Sat, 2012-01-07 at 22:27 -0500, Gene Cumm wrote:
> On Jan 7, 2012 7:15 PM, "Andy Walls" <awalls at md.metrocast.net> wrote:
> >
> > Hi All,
> >
> > I have a system, that, ~80% of the time, fails to properly boot using
> > pxelinux.0 as provided by RHEL 5.5 (syslinux 3.11, I think).  The
> > gripe, before the console messages stop, is a message that comes from
> > the linux kernel complaining the compressed ram drive image is bad:
> > "invalid compressed format (err=[small number])".
> >
> > I have two questions:
> >
> > 1. How should I go about debugging the problem to find the root cause of
> > the initrd image corruption?
> >
> > I've examined the TFTP transfers with Wireshark; they appear OK.
> > I have not tried a more modern pxelinux.0 yet.

Hi Gene,

I won't have physical access to the system again for about a week or so.
(I was asking the questions to ensure I had the right set of tools and
tests when I get access to it again.)

However, I'll try to answer the best I can from memory:

> How big are the kernel/initrd individually?  How much RAM does the system
> have?  What tftpd?  Some tftpds don't handle rollover and will refuse to
> transmit or truncate.

The init ram drive is < 32MB when uncompressed IIRC, and the compressed
kernel is smaller than that.  This is an NFS root setup, so I'm not
trying to TFTP over the whole system.  Also, about 1 out of every 5
times, the system does boot completely successfully.

I'm using the stock in.tftpd on the server.  I can switch to atftpd or
tftpd-hpa as a troubleshooting step.  From the list archives, I see that
atftpd doesn't handle rollover, but AFAIK, I'm not near the 1408*65535 =
87.999 MiB limit.
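
As a quick sanity check of that arithmetic, here is a minimal Python
sketch.  The 1408-byte block size and the 65535 block count are the
figures quoted above; the function name and the 32 MiB upper bound used
for the initrd are illustrative assumptions, not measured values.

```python
MIB = 1024 * 1024

def tftp_max_bytes(block_size, max_blocks=65535):
    """Largest transfer possible before the 16-bit block counter rolls over."""
    return block_size * max_blocks

limit = tftp_max_bytes(1408)
print(f"no-rollover limit: {limit} bytes = {limit / MIB:.3f} MiB")

# Assumed upper bound: the initrd is < 32 MiB even when uncompressed,
# so the compressed image sent over TFTP is smaller still.
initrd_upper_bound = 32 * MIB
print("initrd fits without rollover:", initrd_upper_bound < limit)
```

If the image really is that small, a tftpd that lacks rollover support
is unlikely to explain the corruption on its own.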

> > 2. Can anyone provide a description or simple diagram of what regions of
> > system memory PXELINUX uses?
> >
> > From the linker file, I know that .earlybss starts at 0x800 (or 0x1000),
> > and a number of sections directly follow it.
> > I also realize some allocated regions are system specific, depending on
> > what the BIOS returns in its E820 map.
> > I am, not surprisingly, interested in the regions where PXELINUX stores
> > the kernel and initramdisk images after TFTP-ing them over.
> 
> PXELINUX itself stays in the first 1 MiB.  With regards to e820, I know the
> current version does pay attention to these regions to fail loading.
> linux.c32 will also take these into account to be more intelligent.  I
> believe in at least some scenarios, kernel/initrd start at 1 MiB.  However,
> 3.11 is old enough I can't speak for its internals.
> 
> Any chance you could use meminfo.c32 (not in 3.11), screenshot or
> transcribed?

The next chance I get, I can run that on the problem system.  Though,
since the Linux kernel actually boots, before failing to decompress the
ram disk, I can capture the E820 map info the kernel emits.

My hypothesis is that the EFI firmware (providing BIOS services), an
option ROM, and/or the PXE boot agent is not updating the E820 map or
providing accurate memory usage information, and PXELINUX ends up
colliding with one of their data regions when storing the kernel and
initrd image.
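
One way to test that hypothesis once the kernel's E820 output has been
captured is to check a suspected load range against the reported map.
The sketch below is only an illustration under assumptions: it assumes
the older "BIOS-e820: start - end (type)" line format printed by
2.6.18-era kernels, and the sample log and load addresses are made-up
placeholders, not data from the problem system.

```python
import re

# Assumed line format (older kernels):
#   BIOS-e820: 0000000000000000 - 000000000009f000 (usable)
E820_RE = re.compile(
    r"BIOS-e820:\s+([0-9a-fA-F]+)\s+-\s+([0-9a-fA-F]+)\s+\((.+?)\)"
)

def parse_e820(text):
    """Return a list of (start, end, type) tuples; end is exclusive."""
    return [
        (int(m.group(1), 16), int(m.group(2), 16), m.group(3))
        for m in E820_RE.finditer(text)
    ]

def non_usable_overlaps(regions, start, end):
    """Return regions not marked usable that intersect [start, end)."""
    return [
        r for r in regions
        if r[2] != "usable" and r[0] < end and start < r[1]
    ]

# Hypothetical sample log, for illustration only.
SAMPLE_LOG = """\
 BIOS-e820: 0000000000000000 - 000000000009f000 (usable)
 BIOS-e820: 000000000009f000 - 00000000000a0000 (reserved)
 BIOS-e820: 0000000000100000 - 000000007f000000 (usable)
 BIOS-e820: 000000007f000000 - 0000000080000000 (reserved)
"""

regions = parse_e820(SAMPLE_LOG)
# Hypothetical initrd placement near the top of usable memory.
for start, end, kind in non_usable_overlaps(regions, 0x7E000000, 0x7F800000):
    print(f"overlap with {kind}: {start:#x} - {end:#x}")
```

A clean result would not rule the hypothesis out, since the concern is
precisely that the firmware's map may be incomplete; it only shows
whether the load range conflicts with what the map does report.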

My fallback hypothesis is I'm running into an old PXELINUX bug.  Maybe
something like this I found in the git log:

        commit dd01ec19a62c769e37416878dbe63989d2526660
        Author: H. Peter Anvin <hpa at zytor.com>
        Date:   Sun Feb 14 15:09:04 2010 -0800
        
            bootsect.inc: change 100000h -> free_high_memory
            
            We can't load stuff at 100000h... that will overwrite the PM code.
            That is what free_high_memory is for.
            


> I'd also advise boot testing with a more current version.

Agree.  I plan to go in with the latest git clone from: 

	git://git.kernel.org/pub/scm/boot/syslinux/syslinux.git

It sounds like many tools for examining the system are in the com32
directory.  Are there any tools in there that will not work with a
serial console?

Regards,
Andy

> --
> -Gene




