[syslinux] Cannot chain to another PXE server on the same subnet

Jeffrey Hutzelman jhutz at cmu.edu
Thu Mar 6 14:55:31 PST 2014


On Thu, 2014-03-06 at 16:52 -0500, Gene Cumm wrote:

> > RFC2131, section 4.1, and particularly the second paragraph on page 24.
> 
> Conditionally.  "Options may appear only once, unless otherwise
> specified in the options document."  I don't see any indication of any
> options that DO allow it unless "The information is an opaque object
> of n octets" implies arbitrary length.

What that text is trying to say is that a logical _option_ can appear
only once, but that multiple TLVs can appear with the same tag, in
which case their payloads are concatenated together to obtain the value
of the option.

The only exception is if the specification for the option in question
specifically says that the option can appear multiple times, which the
spec for option 43 does not (in fact, I'm not aware of any option that
does, though there are plenty of obscure options outside the base spec
that I haven't heard of).
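
To make the concatenation rule concrete, here is a minimal Python
sketch. It assumes a raw options buffer with no overload handling, and
parse_options() is a hypothetical helper written for this illustration,
not part of any library. Repeated tags are joined into one logical
value instead of replacing each other.

```python
# Sketch only: collect DHCP option TLVs and join the values of repeated tags.
# parse_options() is a hypothetical helper, not an existing library call.

PAD, END = 0, 255

def parse_options(buf):
    """Return {tag: value}, concatenating the values of repeated tags."""
    options = {}
    i = 0
    while i < len(buf):
        tag = buf[i]
        if tag == PAD:            # pad option: a single byte with no length
            i += 1
            continue
        if tag == END:            # end option: stop parsing
            break
        if i + 1 >= len(buf):
            raise ValueError("truncated option header")
        length = buf[i + 1]
        value = buf[i + 2 : i + 2 + length]
        if len(value) != length:
            raise ValueError("truncated option value")
        # A repeated tag extends the same logical option.
        options[tag] = options.get(tag, b"") + value
        i += 2 + length
    return options

# Two TLVs with tag 43, each carrying half of one logical value.
raw = bytes([43, 2, 0xAA, 0xBB, 43, 2, 0xCC, 0xDD, 255])
opts = parse_options(raw)
print(opts[43].hex())  # aabbccdd
```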


> > Unfortunately, a lot of implementations get this wrong.  For example,
> > some client implementations cannot handle encapsulated vendor options
> > that span more than one instance of the enclosing option 43.
> 
> If we assume 43 is allowed to be repeated, then the contents are an
> arbitrary blob subject to vendor-specific interpretation but _SHOULD_
> be in encapsulated format.

Right.  But that's the concatenated contents, not the contents of each
TLV with tag 43.  Unfortunately, some clients interpret each one
separately.  In any case, that's not directly the cause of the present
issue, because doing that would actually work with the data the Altiris
server is sending.
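
A small Python sketch of where the two interpretations diverge. The
bytes are made up for illustration (they are not from the trace), and
parse_encapsulated() is a hypothetical helper. One encapsulated
sub-option is split across two option 43 TLVs: parsing the concatenated
payload succeeds, while parsing the first TLV on its own runs off the
end.

```python
# Sketch only: encapsulated vendor sub-options are (code, length, value)
# triples inside the option 43 payload. Names and bytes are illustrative.

def parse_encapsulated(blob):
    """Parse encapsulated sub-options into {code: value}."""
    subopts = {}
    i = 0
    while i < len(blob):
        code = blob[i]
        if code == 0:             # pad
            i += 1
            continue
        if code == 255:           # end marker
            break
        if i + 1 >= len(blob):
            raise ValueError("truncated sub-option header")
        length = blob[i + 1]
        value = blob[i + 2 : i + 2 + length]
        if len(value) != length:
            raise ValueError("truncated sub-option value")
        subopts[code] = value
        i += 2 + length
    return subopts

# Sub-option 6, length 4, split across two option 43 payloads.
instances = [bytes([6, 4, 1, 2]), bytes([3, 4])]

# Correct reading: join the payloads first, then parse once.
joined = parse_encapsulated(b"".join(instances))
print(joined[6].hex())  # 01020304

# Per-instance reading: the first payload alone is incomplete.
try:
    parse_encapsulated(instances[0])
    per_instance_ok = True
except ValueError:
    per_instance_ok = False
print(per_instance_ok)  # False
```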


> Reading over those portions of the RFCs, it comes off as the
> pack/unpack functions should be able to handle a nearly arbitrary
> number of instances of an option and shouldn't be repacking the option
> instances together.

For an option you know about, repacking is fine.  The implication of the
text from RFC2131 is that some option might specify that multiple TLVs
are treated as separate instances of the option, in which case repacking
that one would not be OK.

FWIW, I think DHCPv6 solves this problem by explicitly prohibiting an
option from appearing more than once, period, without any possibility of
individual options behaving differently.



> > In this case, it appears that the Altiris server is expecting the client
> > to process each instance of option 43 separately, rather than
> > concatenating their payloads and processing the result as if it were a
> > single option value.  In particular, the server is ending each instance
> 
> It appears they assumed you put an option 255 at the end of each
> option 43 instance and the client will then strip those 255s.

Yup.
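
A rough Python sketch of why that layout matters. The sub-option codes
and bytes are invented for illustration and are not taken from the
Altiris packets; parse_encapsulated() is a hypothetical helper. When
every option 43 instance ends in 255, a client that concatenates first
stops at the first end marker and never sees the later sub-options,
while a client that parses each instance separately sees all of them.

```python
# Sketch only: compare "concatenate, then parse" with "parse each
# instance" when every option 43 payload carries its own 255 end marker.

END = 255

def parse_encapsulated(blob):
    """Parse (code, length, value) sub-options, stopping at END."""
    subopts = {}
    i = 0
    while i < len(blob):
        code = blob[i]
        if code == 0:             # pad
            i += 1
            continue
        if code == END:           # end marker: ignore everything after it
            break
        if i + 1 >= len(blob):
            raise ValueError("truncated sub-option header")
        length = blob[i + 1]
        value = blob[i + 2 : i + 2 + length]
        if len(value) != length:
            raise ValueError("truncated sub-option value")
        subopts[code] = value
        i += 2 + length
    return subopts

# Two hypothetical option 43 payloads, each terminated with 255.
instances = [bytes([1, 1, 0xAA, END]), bytes([2, 1, 0xBB, END])]

# Concatenating first: parsing stops at the 255 that ends instance one.
concatenated = parse_encapsulated(b"".join(instances))

# Parsing each instance separately: both sub-options are seen.
per_instance = {}
for payload in instances:
    per_instance.update(parse_encapsulated(payload))

print(sorted(concatenated))  # [1]
print(sorted(per_instance))  # [1, 2]
```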


>    If a vendor potentially encodes more than one item of information in
>    this option, then the vendor SHOULD encode the option using
>    "Encapsulated vendor-specific options" as described below:
> 
> SHOULD makes me think a vendor can choose how much of the suggestion to follow.

Yes, that's true(*).  But that's not the same as the DHCP server getting
to choose how much to follow.  "Vendor" here means the vendor whose
definition of option 43 applies to this packet, which is the vendor
whose identifier appears in the DHCP vendor class identifier option.  In
this case, that's "PXEClient", which means the format of option 43 is
defined in the PXE specification.  The PXE specification is
unfortunately rather vague, but does specify the use of encapsulated
vendor options, refers once to RFC2132, and does not define any
different encapsulation format.



In any case, I think I was barking up the wrong tree.  What the trace
shows is a normal DHCP handshake with the server at 10.215.144.7, along
with the Altiris server responding in so-called "Proxy DHCP" mode, in
which it provides PXE configuration but does not assign the client an
address.  That's a standard feature, and it's why the whole thing works
if the primary DHCP server doesn't send any PXE response at all.  If PXE
options are present in the response from the real DHCP server, the PXE
spec requires that they take precedence over options provided by a
"Proxy DHCP" server.


The key thing here is that the Altiris server provides a PXE menu, which
the PXE client is expected to interpret, display for the user, and then
send an appropriate followup PXE request once the user selects an item.
This happens at the PXE/DHCP protocol layer, normally before any
bootstrap program is loaded.  The "bstrap.0" file is a small NBP that
handles the menu when booting from old PXE clients that don't implement
the menu-related parts of the protocol.  But even with bstrap.0, the
menu stuff doesn't work if the menu-related options weren't in whichever
DHCP response gets used for PXE.

Unfortunately, what this means is that selecting things from the Altiris
menu isn't going to work unless the client is getting PXE bits from that
server and _not_ from the main DHCP server.  Getting that to happen may
be rather difficult, as it more or less requires triggering another DHCP
exchange, and doing so in such a way that the main server doesn't
provide any options (for example, by sending something in the request
that the main DHCP server config conditions the PXE response bits on).


-- Jeff


(*) To an extent.  SHOULD or RECOMMENDED in most RFCs, including this
one (even though it is slightly too old to incorporate RFC2119 by
reference), has a rather stronger meaning than in normal English prose.
It approximately means "you have to do this unless you carefully weigh
the implications and conclude you have good reason to do otherwise".



