[syslinux] [PATCH][GIT-PULL] lwIP undiif: Fixes for VMware platforms and general fixes

Thu Oct 20 18:11:16 PDT 2011

Gene Cumm <gene.cumm at gmail.com> writes:

> On Tue, Oct 18, 2011 at 14:57, Eric W. Biederman <ebiederm at xmission.com> wrote:
>> "H. Peter Anvin" <hpa at zytor.com> writes:
>>
>>> On 10/14/2011 06:23 PM, Gene Cumm wrote:
>>>>>
>>>>> Hi Gene,
>>>>>
>>>>> I have merged this... I'm still not convinced it is the right solution -- in
>>>>> particular I'm not convinced we should *ever* call ethernet_input() -- but
>>>>> it's a lot better than the previous (known broken) code.
>>>>>
>>>>> Thanks!
>>>>
>>>> Thanks also to Simon Goldschmidt who saw the issue and suggested the
>>>> direction of the fix.
>>>>
>>>> It seems the more important question is what could go wrong.  What if
>>>> it's non-Ethernet and calls ethernet_input() or is Ethernet and
>>>> doesn't call ethernet_input() but just ip_input() instead?
>>>>
>>>
>>> The difference, as far as I can tell, is that ethernet_input() expects a
>>> 14-byte Ethernet header, and will handle ARP.  Since the UNDI code
>>> should be doing ARP on its own (it has to, we might not be using
>>> Ethernet; consider Infiniband for example) then we should be able to
>>> just pass the IP packets to ip_input().
>>
>> As I recall the practical difference is how we are talking to the undi
>> stack.  In one mode the UNDI stack can give us a raw ethernet header and
>> let us process that according to the ethertype in the packet.  In
>> another mode the undi stack will give us a packet without the link layer
>> header that undi has already classified as an ip, arp or rarp
>> packet.  We then have to process that.  Which means undi absolutely
>> requires a network that uses arp to support ip (which includes IB),
>> and that even in undi mode we have to process arp packets.
>>
>> If someone can give me a pointer I will be happy to give this change
>> a bit of review.  Somehow I lost the thread of this conversation.
>
> tcpip_input() drops a packet into tcpip_thread()'s mbox which then
> calls ethernet_input() or ip_input() as "appropriate" which allows the
> sending thread to finish creating the socket before the response
> packet is processed in tcp_input().

tcpip_input always calls ethernet_input if you are an ethernet device.
So the only practical difference is bouncing through an extra thread.
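
As a rough illustration of that dispatch, here is a minimal sketch of the
choice tcpip_thread() makes for an incoming packet.  It is not the lwIP
source: the struct, the enum, the function name and the flag value are all
stand-ins invented for this example, and only the idea (an interface with
NETIF_FLAG_ETHARP set goes to ethernet_input(), anything else to
ip_input()) comes from the discussion.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for lwIP's NETIF_FLAG_ETHARP; the value here is illustrative. */
#define SKETCH_FLAG_ETHARP 0x20u

/* Hypothetical, trimmed-down interface: only the flags matter here. */
struct sketch_netif {
    unsigned flags;
};

/* Which input routine the packet would be handed to. */
enum sketch_handler {
    SKETCH_ETHERNET_INPUT,  /* expects a link-layer header, handles ARP */
    SKETCH_IP_INPUT         /* expects a bare IP packet */
};

/*
 * The decision is made purely from the interface flags, not from what
 * the driver actually delivered.  A driver that sets the ETHARP flag but
 * hands over header-less packets would therefore be misrouted.
 */
enum sketch_handler sketch_pick_input(const struct sketch_netif *nif)
{
    assert(nif != NULL);
    if (nif->flags & SKETCH_FLAG_ETHARP)
        return SKETCH_ETHERNET_INPUT;
    return SKETCH_IP_INPUT;
}
```

The point of the sketch is that the extra thread adds no per-packet
decision of its own; the same flag test could be made directly in the
driver.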

Maybe there is a good reason for bouncing through an extra thread.
I seem to have vague memories about it just being plain less efficient
and killing throughput.

I deliberately did not use tcpip_input.  At the very least because it
was an apparently unnecessary complication.

I guess I would tend to focus on why we are not ready to receive
a packet.  That sounds like a more straightforward bug to fix than
weird crazy timing issues.

> I think we either need to always not set NETIF_FLAG_ETHARP in
> undiif.flags (and handle ARP inside undiif.c) OR logic in
> tcpip_thread() needs to change such that we can choose where it goes
> by msg.type (probably with 2 new types to force either
> ethernet_input() or ip_input) or by a new field in the struct
> tcpip_msg.

Or we have the third option: if we really need to queue packets
to be processed by a thread, write our own little mini-version of
the tcpip_thread that does just what we need and has enough smarts
to cope with the weirdness of undi.
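
To make that third option concrete, here is a minimal sketch of the data
path such a mini-thread might use: each queued packet carries a tag saying
how undi delivered it, and the consumer picks the handler from the tag
instead of from the interface flags.  Every name below is hypothetical,
the queue has no locking (a real one would sit behind an mbox or
semaphore), and dropping rarp is only a placeholder assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tags for how the packet arrived from undi. */
enum sketch_kind { SKETCH_RAW_FRAME, SKETCH_IP, SKETCH_ARP, SKETCH_RARP };

/* What the consumer would do with the packet. */
enum sketch_action {
    SKETCH_DO_ETHERNET_INPUT,  /* full frame: parse the ethertype */
    SKETCH_DO_IP_INPUT,        /* header-less IP packet */
    SKETCH_DO_ARP_INPUT,       /* header-less ARP packet */
    SKETCH_DO_DROP             /* placeholder choice for RARP */
};

struct sketch_msg {
    enum sketch_kind kind;
    const void *pkt;           /* would be a struct pbuf * in lwIP */
};

#define SKETCH_QLEN 8

/* Fixed-size ring buffer; single-threaded in this sketch. */
struct sketch_queue {
    struct sketch_msg slot[SKETCH_QLEN];
    size_t head;
    size_t count;
};

/* Producer side (receive path): returns 0 on success, -1 if full. */
int sketch_post(struct sketch_queue *q, enum sketch_kind kind, const void *pkt)
{
    size_t tail;

    assert(q != NULL);
    if (q->count == SKETCH_QLEN)
        return -1;
    tail = (q->head + q->count) % SKETCH_QLEN;
    q->slot[tail].kind = kind;
    q->slot[tail].pkt = pkt;
    q->count++;
    return 0;
}

/* Consumer side (the mini-thread): returns 0 on success, -1 if empty. */
int sketch_fetch(struct sketch_queue *q, struct sketch_msg *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0)
        return -1;
    *out = q->slot[q->head];
    q->head = (q->head + 1) % SKETCH_QLEN;
    q->count--;
    return 0;
}

/* Route by the tag the driver attached, not by netif flags. */
enum sketch_action sketch_dispatch(const struct sketch_msg *m)
{
    assert(m != NULL);
    switch (m->kind) {
    case SKETCH_RAW_FRAME: return SKETCH_DO_ETHERNET_INPUT;
    case SKETCH_IP:        return SKETCH_DO_IP_INPUT;
    case SKETCH_ARP:       return SKETCH_DO_ARP_INPUT;
    default:               return SKETCH_DO_DROP;
    }
}
```

Tagging at enqueue time is the same idea as adding new message types or a
new field to struct tcpip_msg, just kept local to the undi code.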

Bouncing through an unnecessary thread just seems ugly to me.

To make things particularly clean and beautiful the arp layer needs to
be abstracted away from ethernet.  But that hasn't happened yet.

lwIP 1.4 has been officially out since May so we may want to look and
see if there are any benefits to upgrading our code base.  I remember
doing a preliminary look and seeing that the upgrade would not be
particularly difficult.

Eric