[syslinux] lpxelinux.0 issues with larger initrd.img files from RHEL >= 7.5 on UCS servers?
hpa at zytor.com
Wed Jun 19 13:05:50 PDT 2019
On June 19, 2019 12:21:05 PM PDT, Mathieu Chouquet-Stringer <m+syslinux at thi.eu.com> wrote:
> Hello,
>
>On Tue, Jun 18, 2019 at 05:31:17PM -0700, hpa at zytor.com wrote:
>> Which servers, what threshold, what clients, what about pxelinux.0?
>
>All affected servers so far are Cisco UCS B200 M3 blade servers.
>
>The threshold seems to be around 50MB. I haven't tested precisely, but
>54525200 bytes is enough to trigger the reboot while 49812292 isn't
>(I got the smaller file by recompressing the initrd with a higher
>compression setting).
>
>Today I tried pxelinux.0 instead of lpxelinux.0 and it behaves the same
>way: if the initrd is "too big", the server reboots.
>
>What do you mean by "clients"?
>
>Please find below what I initially wrote in my email, let me know if or
>how I can help debug that.
>
>Cheers,
>
>> Hello,
>>
>> I am using lpxelinux.0 (latest stable version 6.03, using the official
>> binaries from kernel.org) to kickstart servers, as http transfers really
>> help over links with poor latencies... These servers are being booted in
>> legacy mode, not in UEFI.
>>
>> This has worked very well until recently. Starting with RHEL 7.5, on
>> some servers, we would see the machine rebooting while pxelinux is in
>> the middle of downloading the initrd.img file.
>>
>> A quick workaround was to tell people to boot/kickstart using a previous
>> minor release (the last working one was 7.4), with the install process
>> taking care of updating to the latest version.
>>
>> I experienced the same issue again yesterday and had time to think about
>> it again.
>>
>> On RHEL 7.4, the last working version in my case, the file is this big:
>> -rw-r--r-- 1 root root 49763300 Dec 1 2017 initrd.img
>>
>> I saw on 7.5 and 7.6 they're slightly larger, respectively:
>> -r--r--r-- 1 root root 54525200 Mar 22 2018 initrd.img
>> and
>> -r--r--r-- 1 root root 54799220 Oct 10 2018 initrd.img
>>
>> Because I have no trace on the screen to explain the reboot (same thing
>> with a recorded session over a serial console), I was like: what if the
>> size is a factor? Given I had nothing else to try, I looked at the files
>> and saw they were compressed with xz.
>>
>> So I wondered, what if I compressed them more (I guessed they were
>> compressed with the default compression preset)? After decompressing and
>> recompressing with xz -9 -C crc32, here's what I get for 7.5 and 7.6
>> respectively:
>> -r--r--r-- 1 root root 48823576 May 21 11:45 initrd.img
>> and
>> -r--r--r-- 1 root root 49812292 May 21 11:27 initrd.img
>>
>> Pretty close to what I had in 7.4. And to my surprise, with these smaller
>> files, lpxelinux doesn't reboot while downloading the file over http. The
>> kernel boots up and the OS is installed properly....
>>
>> I haven't had the time to reproduce this issue over regular
>> pxelinux/tftp so I don't know if it's just tied to lpxelinux/http or
>> not.
>>
>> Also, so far this bug only seems to be triggered on some Cisco UCS
>> servers such as UCSB-B200-M3 like the one described below. So maybe it
>> could be related to BIOS or memory maps, I am not sure!?
>>
>> I'd hate to go back to tftp because the switch to http was such a huge
>> step forward.
>>
>> Is there something I could do or provide to help debug this issue? I
>> read the "Hardware Compatibility" and "Common Problems" pages on the
>> wiki and found nothing close to what I'm seeing. I started reading the
>> "Development/Debugging" page, and while I could use "COM32 debug.c32" to
>> get more details, I don't know which functions I should be tracing.
>>
>> Please let me know.
>>
>> Cheers,
>> Mathieu
>>
>> BIOS Information
>> Vendor: Cisco Systems, Inc.
>> Version: B200M3.2.2.6f.0.052120182033
>> Release Date: 05/21/2018
>> Address: 0xF0000
>> Runtime Size: 64 kB
>> ROM Size: 4096 kB
>> Characteristics:
>> PCI is supported
>> BIOS is upgradeable
>> BIOS shadowing is allowed
>> Boot from CD is supported
>> Selectable boot is supported
>> BIOS ROM is socketed
>> EDD is supported
>> 5.25"/1.2 MB floppy services are supported (int 13h)
>> 3.5"/720 kB floppy services are supported (int 13h)
>> 3.5"/2.88 MB floppy services are supported (int 13h)
>> Print screen service is supported (int 5h)
>> 8042 keyboard services are supported (int 9h)
>> Serial services are supported (int 14h)
>> Printer services are supported (int 17h)
>> ACPI is supported
>> USB legacy is supported
>> BIOS boot specification is supported
>> Targeted content distribution is supported
>> UEFI is supported
>> BIOS Revision: 4.6
>>
>> System Information
>> Manufacturer: Cisco Systems Inc
>> Product Name: UCSB-B200-M3
>> Version: 1
>> Serial Number: MYSERIALNUMBER
>> UUID: SOMEUUID
>> Wake-up Type: Other
>> SKU Number:
>> Family:
Sounds like you may want to contact Cisco...
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
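
The recompression workaround described in the quoted message (decompress the
initrd, then recompress with xz -9 -C crc32) can be sketched in Python with
the standard lzma module. This is only an illustration under assumptions: the
function names recompress_xz and recompress_file are hypothetical, the initrd
is assumed to be a plain xz stream small enough to hold in memory, and the
output will not necessarily be byte-identical to what the xz command produces.

```python
import lzma
from pathlib import Path


def recompress_xz(data: bytes) -> bytes:
    """Decompress xz data and recompress it at preset 9 with a CRC32 check.

    Mirrors the settings of `xz -9 -C crc32` from the quoted message.
    """
    raw = lzma.decompress(data)
    return lzma.compress(
        raw,
        format=lzma.FORMAT_XZ,
        check=lzma.CHECK_CRC32,
        preset=9,
    )


def recompress_file(src: str, dst: str) -> tuple[int, int]:
    """Recompress the file at src into dst; return (old_size, new_size) in bytes."""
    original = Path(src).read_bytes()
    smaller = recompress_xz(original)
    Path(dst).write_bytes(smaller)
    return len(original), len(smaller)
```

Calling recompress_file("initrd.img", "initrd-small.img") returns both sizes,
which can then be compared against the roughly 50MB threshold reported above.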