[syslinux] Very slow download with pxelinux > 4.07 on specific hardware

Eric PEYREMORTE eric.peyremorte at iut-valence.fr
Fri Mar 14 16:31:32 PDT 2014


On 14/03/2014 18:40, H. Peter Anvin wrote:
> On 03/13/2014 04:09 AM, Eric PEYREMORTE wrote:
>> On 12/03/2014 22:00, H. Peter Anvin wrote:
>>> On 03/10/2014 04:15 PM, Gene Cumm wrote:
>>>> It's also a balance of time.  While working on 4.10-pre*/5.10-pre*, I
>>>> found that some hardware misreports its behavior.  "Sure, Interrupts
>>>> work" but they don't is but one that I worked around on specific
>>>> hardware.
>>>>
>>> The odd part is that people are reporting this even using the legacy PXE
>>> implementation (not lpxelinux.0)...
>>>
>>>      -hpa
>> If there is a way to get useful debug traces let me know.
>>
>> By the way, everything is slow from the moment the following string
>> appears :
>>
>> PXELINUX 5.10 0x5321850f
>>
> I am *assuming* you are seeing the full copyright banner here, not just
> the above string (dumb question, I know, but sometimes it really, really
> matters.)
Yes, sorry, I see the full copyright banner :-)
>> I tried to search through the code and compare different versions to
>> understand what's wrong, but I definitely don't have the required
>> skills...
> This is very challenging.  One of the big problems is that the legacy
> network code (pxelinux.0 as opposed to lpxelinux.0) was pulled out and
> then pulled back in, and clearly something changed in the process.
And there is a large amount of change in the code between versions, so
it is not easy to guess which change is the culprit.
>
> I looked over your wire trace and there is a fixed amount of delay --
> just under 20 ms -- between each packet, which strongly implies that it
> ends up waiting for some kind of timer to expire.  *What* timer that is
> is less clear, because the only *architectural* timer is the 55 ms timer
> interrupt, which doesn't fit the observed time.  That implies this is a
> timer inside the PXE code.  Why that didn't happen before and does now
> is the real mystery.
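
To make the timing argument above concrete, here is a minimal sketch in Python that computes the gaps between packets from a list of capture timestamps and compares the typical gap with the legacy BIOS timer tick (about 54.9 ms, i.e. roughly 18.2 ticks per second). The function names and the sample timestamps are hypothetical, chosen only to illustrate a fixed gap of just under 20 ms; they are not taken from the actual wire trace.

```python
from statistics import median

# Legacy PC timer: 1193182 Hz PIT clock divided by 65536, about 18.2 Hz.
BIOS_TICK_MS = 65536 / 1193182 * 1000.0  # roughly 54.9 ms


def inter_packet_gaps_ms(timestamps_s):
    """Return the gap in milliseconds between consecutive timestamps (seconds)."""
    return [(b - a) * 1000.0 for a, b in zip(timestamps_s, timestamps_s[1:])]


def summarize(timestamps_s):
    """Median gap, and that gap expressed as a fraction of one BIOS tick."""
    gaps = inter_packet_gaps_ms(timestamps_s)
    med = median(gaps)
    return {"median_ms": med, "ratio_to_bios_tick": med / BIOS_TICK_MS}


# Hypothetical timestamps (seconds), NOT values from the real capture.
sample = [0.0, 0.0197, 0.0394, 0.0591, 0.0788]
result = summarize(sample)
```

A median gap well below one BIOS tick, as in this made-up sample, is consistent with the delay coming from some other timer than the 55 ms interrupt.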
>
>> What I notice from the Wireshark traces is that pxelinux.0 is loaded
>> really quickly. Then it fetches ldlinux.c32 very slowly (and the next
>> files too).
>>
>> For lpxelinux.0, from the trace, everything is slow too, but at some
>> point the client seems stuck in a loop, sending the acknowledgement for
>> a packet again and again. The server tries to send the next packet but
>> the client keeps sending the ACK for the previous one.
> Right... this implies that the receiver stopped functioning so the
> machine "went deaf".  That is a fairly common failure mode, but why it
> happens here is again the big question.
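
The "stuck on the same ACK" pattern described above can be looked for mechanically in a capture. The sketch below assumes the transfer is TFTP (RFC 1350), where an ACK packet is a 2-byte opcode of 4 followed by a 2-byte block number, and that the UDP payloads sent by the client have already been extracted in order. The helper names and the sample payloads are hypothetical.

```python
import struct

TFTP_ACK = 4  # ACK opcode as defined in RFC 1350


def parse_ack_block(payload):
    """Return the block number if payload is a TFTP ACK, otherwise None."""
    if len(payload) < 4:
        return None
    opcode, block = struct.unpack("!HH", payload[:4])
    return block if opcode == TFTP_ACK else None


def longest_repeated_ack(client_payloads):
    """Return (block, count) for the longest run of consecutive identical ACKs."""
    best_block, best_run = None, 0
    prev, run = None, 0
    for payload in client_payloads:
        block = parse_ack_block(payload)
        if block is None:
            continue  # not an ACK, ignore it
        if block == prev:
            run += 1
        else:
            prev, run = block, 1
        if run > best_run:
            best_block, best_run = block, run
    return best_block, best_run


# Hypothetical client payloads: ACKs for blocks 1, 2, then block 3 three times.
sample = [struct.pack("!HH", TFTP_ACK, n) for n in (1, 2, 3, 3, 3)]
stuck = longest_repeated_ack(sample)
```

A long run for a single block number, while the server keeps retransmitting the next data block, would match the "went deaf" behaviour described in the reply.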
At first I thought about a semaphore problem somewhere, but the code is
too complex for me to follow.

>
> Unfortunately I only have my "spare time" to work on Syslinux anymore,
> which makes hard problems like this difficult to dig into.  I *really*
> appreciate the debugging information you have already given us... it
> gives us a starting point at least.  The 20 ms delay is a very important
> clue.
>
> 	-hpa
Don't worry, for now I'm sticking with pxelinux 4.07 + iPXE for fast
HTTP downloads.
But if you or anyone else needs me to run more tests later, I can help :-)

Thanks for your answer,
Eric


