[syslinux] PXELINUX: how to debug initrd corruption?

Andy Walls awalls at md.metrocast.net
Sat Feb 11 16:11:27 PST 2012


On Sun, 2012-01-08 at 17:49 -0500, Andy Walls wrote:
> On Sat, 2012-01-07 at 22:27 -0500, Gene Cumm wrote:
> > On Jan 7, 2012 7:15 PM, "Andy Walls" <awalls at md.metrocast.net> wrote:
> > >
> > > Hi All,
> > >
> > > I have a system that, ~80% of the time, fails to properly boot using
> > > pxelinux.0 as provided by RHEL 5.5 (syslinux 3.11, I think).  The
> > > gripe, before the console messages stop, is a message that comes from
> > > the linux kernel complaining the compressed ram drive image is bad:
> > > "invalid compressed format (err=[small number])".

> This is an NFS root setup, so I'm not
> trying to TFTP over the whole system.  Also, about 1 out of every 5
> times, the system does boot completely successfully.
> 

> > > From the linker file, I know that .earlybss starts at 0x800 (or 0x1000),
> > > and a number of sections directly follow it.
> > > I also realize some allocated regions are system specific, depending on
> > > what the BIOS returns in its E820 map.
> > > I am, not surprisingly, interested in the regions where PXELINUX stores
> > > the kernel and initramdisk images after TFTP-ing them over.
> > 
> > PXELINUX itself stays in the first 1 MiB.  With regards to e820, I know the
> > current version does pay attention to these regions to fail loading.
> > linux.c32 will also take these into account to be more intelligent.  I
> > believe in at least some scenarios, kernel/initrd start at 1 MiB.  However,
> > 3.11 is old enough I can't speak for its internals.
> > 
> > Any chance you could use meminfo.c32 (not in 3.11), screenshot or
> > transcribed?
> 
> The next chance I get, I can run that on the problem system.  Though,
> since the Linux kernel actually boots, before failing to decompress the
> ram disk, I can capture the E820 map info the kernel emits.
> 
> My hypothesis is that the EFI firmware (providing BIOS services), an
> option ROM, and/or the PXE boot agent is not updating the E820 map or
> providing accurate memory usage information, and PXELINUX ends up
> colliding with one of their data regions when storing the kernel and
> initrd image.

Just following up on this.

It appears that the EFI firmware and PXELINUX v3.11 were stomping on
each other in the memory range 0x800-0xfff.  Upgrading to PXELINUX v4.05
resolved the problem for me.

Sadly, the E820 map (and hence meminfo.c32) did not report the use of
0x800-0xfff by the firmware (or 0x0-0x7ff, for that matter).  The memmap
command from the EFI shell, however, did report the usage of 0x0-0xfff.
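
For anyone wanting to run the same kind of check, here is a minimal
sketch in Python of how one might test a suspect address range against
the E820 lines the kernel prints at boot.  It assumes the older
"BIOS-e820: <start> - <end> (<type>)" line format with an exclusive end
address; the sample lines are made up for illustration and are not the
map from the problem system, and the helper names are hypothetical.

```python
import re

# Assumed format (older kernels): " BIOS-e820: <start> - <end> (<type>)"
# with hexadecimal addresses and an exclusive end address.
E820_LINE = re.compile(
    r"BIOS-e820:\s+([0-9a-fA-F]+)\s+-\s+([0-9a-fA-F]+)\s+\((.+?)\)"
)


def parse_e820(text):
    """Return a list of (start, end, type) tuples found in boot log text."""
    regions = []
    for line in text.splitlines():
        m = E820_LINE.search(line)
        if m:
            regions.append((int(m.group(1), 16), int(m.group(2), 16), m.group(3)))
    return regions


def overlapping(regions, start, end):
    """Return the regions that intersect the half-open range [start, end)."""
    return [r for r in regions if r[0] < end and start < r[1]]


# Illustrative sample only, NOT captured from the system discussed here.
SAMPLE = """\
 BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
 BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000bf000000 (usable)
"""

regions = parse_e820(SAMPLE)
# 0x800-0xfff inclusive is the half-open range [0x800, 0x1000).
hits = overlapping(regions, 0x800, 0x1000)
for start, end, kind in hits:
    print(f"{start:#x}-{end:#x} {kind}")
```

If the only region covering the range is marked usable, as in the
sample, then the E820 map gives a boot loader no hint that the firmware
is using that memory, so a second source such as the EFI shell's memmap
output is needed to spot the conflict.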

Crummy firmware code.... :(

Thanks for the help.

Regards,
Andy



