[syslinux] lpxelinux.0 issues with larger initrd.img files from RHEL >= 7.5 on UCS servers?

hpa at zytor.com hpa at zytor.com
Wed Jun 19 13:05:50 PDT 2019


On June 19, 2019 12:21:05 PM PDT, Mathieu Chouquet-Stringer <m+syslinux at thi.eu.com> wrote:
>	Hello,
>
>On Tue, Jun 18, 2019 at 05:31:17PM -0700, hpa at zytor.com wrote:
>> Which servers, what threshold, what clients, what about pxelinux.0?
>
>All affected servers so far are Cisco UCS B200 M3 blade servers.
>
>The threshold seems to be around 50 MB. I haven't tested precisely, but
>54525200 bytes is enough to trigger the reboot while 49812292 isn't
>(I got the smaller file by recompressing the initrd with a higher
>compression setting).
>
>Today I tried pxelinux.0 instead of lpxelinux.0 and it behaves the same
>way: if the initrd is "too big", the server reboots.
>
>What do you mean by "clients"?
>
>Please find below what I initially wrote in my email, let me know if or
>how I can help debug that.
>
>Cheers,
>
>>  	Hello,
>>  
>>  I am using lpxelinux.0 (latest stable version 6.03, using the official
>>  binaries from kernel.org) to kickstart servers, as http transfers really
>>  help over links with poor latencies... These servers are being booted in
>>  legacy mode, not in UEFI.
>>  
>>  This has worked very well until recently. Starting with RHEL 7.5, on
>>  some servers, we would see the machine rebooting while pxelinux is in the
>>  middle of downloading the initrd.img file.
>>  
>>  A quick workaround was to tell people to boot/kickstart using a previous
>>  minor release (the last working one was 7.4), with the install process
>>  taking care of updating to the latest version.
>>  
>>  I experienced the same issue again yesterday and had time to think about
>>  it again.
>>  
>>  On RHEL 7.4, the last working version in my case, the file is this big:
>>     -rw-r--r--   1 root     root     49763300 Dec  1  2017 initrd.img
>>  
>>  I saw on 7.5 and 7.6 they're slightly larger, respectively:
>>     -r--r--r--   1 root     root     54525200 Mar 22  2018 initrd.img
>>  and
>>     -r--r--r--   1 root     root     54799220 Oct 10  2018 initrd.img
>>  
>>  Because I have no trace on the screen to explain the reboot (same thing
>>  with a recorded session over a serial console), I was like: what if the
>>  size is a factor? Given I had nothing else to try, I looked at the files
>>  and saw they were compressed with xz.
>>  
>>  So I wondered, what if I compressed them more (I guessed they were
>>  compressed with the default compression preset)? After uncompressing and
>>  compressing with xz -9 -C crc32, here's what I get for 7.5 and 7.6
>>  respectively:
>>     -r--r--r--   1 root     root     48823576 May 21 11:45 initrd.img
>>  and
>>     -r--r--r--   1 root     root     49812292 May 21 11:27 initrd.img
>>  
>>  Pretty close to what I had in 7.4. And to my surprise, with these smaller
>>  files, lpxelinux doesn't reboot while downloading the file over http. The
>>  kernel boots up and the OS is installed properly....
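The recompression step described above can be scripted. What follows is
only a minimal sketch, assuming Python's standard lzma module is used in
place of the xz command line tool. The function name recompress_xz and the
file names in the usage note are hypothetical; preset 9 and the CRC32
integrity check are chosen to mirror the "xz -9 -C crc32" invocation quoted
in the message.

```python
import lzma
import os
import shutil


def recompress_xz(src_path, dst_path, preset=9):
    """Recompress an xz file with the given preset and a CRC32 check.

    Hypothetical helper for illustration: it streams the decompressed
    data from src_path into a new .xz container at dst_path and returns
    the size of the new file in bytes.
    """
    with lzma.open(src_path, "rb") as src, lzma.open(
        dst_path,
        "wb",
        format=lzma.FORMAT_XZ,
        check=lzma.CHECK_CRC32,  # same integrity check as "-C crc32"
        preset=preset,           # 9 corresponds to "xz -9"
    ) as dst:
        # Copy in chunks so the whole initrd is never held in memory.
        shutil.copyfileobj(src, dst)
    return os.path.getsize(dst_path)
```

A call such as recompress_xz("initrd.img", "initrd-small.img") would write
the recompressed copy next to the original, and the returned size can then
be compared against the sizes listed above. Whether the result lands under
the problematic size depends on the image contents.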
>>  
>>  I haven't had the time to reproduce this issue over regular
>>  pxelinux/tftp so I don't know if it's just tied to lpxelinux/http or
>>  not.
>>  
>>  Also, so far this bug only seems to be triggered on some Cisco UCS
>>  servers such as UCSB-B200-M3 like the one described below. So maybe it
>>  could be related to BIOS or memory maps, I am not sure!?
>>  
>>  I'd hate to go back to tftp because the switch to http was such a huge
>>  step forward.
>>  
>>  Is there something I could do or provide to help debug this issue? I
>>  read the "Hardware Compatibility" and "Common Problems" pages on the
>>  wiki and found nothing close to what I'm seeing. I started reading the
>>  "Development/Debugging" page, and while I could use "COM32 debug.c32" to
>>  get more details, I don't know which functions I should be tracing.
>>  
>>  Please let me know.
>>  
>>  Cheers,
>>  Mathieu
>>  
>>  BIOS Information
>>             Vendor: Cisco Systems, Inc.
>>             Version: B200M3.2.2.6f.0.052120182033
>>             Release Date: 05/21/2018
>>             Address: 0xF0000
>>             Runtime Size: 64 kB
>>             ROM Size: 4096 kB
>>             Characteristics:
>>                     PCI is supported
>>                     BIOS is upgradeable
>>                     BIOS shadowing is allowed
>>                     Boot from CD is supported
>>                     Selectable boot is supported
>>                     BIOS ROM is socketed
>>                     EDD is supported
>>                     5.25"/1.2 MB floppy services are supported (int 13h)
>>                     3.5"/720 kB floppy services are supported (int 13h)
>>                     3.5"/2.88 MB floppy services are supported (int 13h)
>>                     Print screen service is supported (int 5h)
>>                     8042 keyboard services are supported (int 9h)
>>                     Serial services are supported (int 14h)
>>                     Printer services are supported (int 17h)
>>                     ACPI is supported
>>                     USB legacy is supported
>>                     BIOS boot specification is supported
>>                     Targeted content distribution is supported
>>                     UEFI is supported
>>             BIOS Revision: 4.6
>>  
>>  System Information
>>             Manufacturer: Cisco Systems Inc
>>             Product Name: UCSB-B200-M3
>>             Version: 1
>>             Serial Number: MYSERIALNUMBER
>>             UUID: SOMEUUID
>>             Wake-up Type: Other
>>             SKU Number:
>>             Family:

Sounds like you may want to contact Cisco...
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


