[syslinux] COMBOOT API: Add calls for directory functions; Implement for FAT

Gene Cumm gene.cumm at gmail.com
Sat Jan 10 16:34:37 PST 2009


On Fri, Dec 5, 2008 at 3:47 PM, H. Peter Anvin <hpa at zytor.com> wrote:
> Jernej Simončič wrote:
>> On Thursday, December 4, 2008, 23:37:55, H. Peter Anvin wrote:
>>
>>> It doesn't, and -- this is the really sucky part -- in general it really
>>> can't.  However, we can probably hack up something that is Good
>>> Enough[TM], basically by inverting the first field mapping in the
>>> codepage array and returning '?' or some such for anything we don't
>>> recognize -- I think there is actually a particular character specced in
>>> the FAT specification (probably '?', but it might be '_').
>>
>> IIRC, Win9x used _ when it encountered filenames with characters
>> outside the active codepage (and wouldn't let you access these files,
>> unless you "repaired" the volume with scandisk, which destroyed the
>> filenames).
>>
>
> Right, of course... '?' is an illegal filename character in Windoze.
>
> Not that it matters all that much in this case -- those filenames will
> be inaccessible, because they won't match either way.
>
> There is a third option, which might be better, actually: if the
> filename contains a character we can't make out, return the shortname
> for that file.  The shortname will always be representable.
>
>        -hpa
>

Just to review (for my own clarification), these are the possibilities
when a UTF-16 character is above 0x00FF:

-Truncate the wide part; guaranteed not to display the name correctly
for UTF-16 values above 0x00FF (and maybe for more).
-Use a pair of localized translation tables, UTF-16 to 8-bit and the
reverse; could easily take up to 64 KiB for the forward table (which
could be shrunk if a length is specified) plus 512 B for the reverse
table, and might not reliably translate an 8-bit character back to the
same UTF-16 character, especially if the same character can be
represented by two different UTF-16 values (I'm not familiar enough
with UTF-16 to know if this happens) or if there is no 8-bit
equivalent character.
-Simple translation (as Peter mentioned), replacing each
non-displayable character with a displayable one; '?' is illegal in
FAT (and I think all others) but would clearly signal a foreign
character.  '_' is the behavior of Windows 9x, but the resulting name
doesn't match the real one.
-Skip the long name and use the shortname; always displayable; typeable
most of the time (my VM's BIOS won't let me do ALT+224 but can do
ALT+225); won't mismatch; relatively simple to code (I just finished it
in 2 lines).
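
To make the first and third options concrete, here is a rough sketch of
narrowing a UTF-16 name to 8 bits with a substitute character. This is
not code from the patch: narrow_with_replacement() and its parameters
are hypothetical names, and it assumes that code units up to 0x00FF can
simply be truncated, which ignores the active codepage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of Syslinux): copy up to len UTF-16
 * code units from src into dst as 8-bit characters, stopping early at
 * a NUL code unit.  Any code unit above 0x00FF is replaced by subst.
 * dst must have room for len + 1 bytes.
 * Returns the number of characters that had to be replaced.
 */
static size_t narrow_with_replacement(const uint16_t *src, size_t len,
                                      char *dst, char subst)
{
    size_t replaced = 0;
    size_t i;

    assert(src != NULL && dst != NULL);

    for (i = 0; i < len && src[i] != 0; i++) {
        if (src[i] > 0x00FF) {
            dst[i] = subst;            /* not representable in 8 bits */
            replaced++;
        } else {
            dst[i] = (char)(unsigned char)src[i];  /* keep low byte */
        }
    }
    dst[i] = '\0';
    return replaced;
}
```

A non-zero return value means the resulting name will not match the
name on disk, which is the drawback noted above for both '?' and '_'.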

Based on this, I went with the last option for now: skip the long name
in favor of the short name.
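
For illustration only, the fallback could look roughly like the sketch
below. It is not the actual two-line change: pick_display_name(),
longname_is_narrow() and their parameters are made-up names, and the
0x00FF cutoff is the same assumption as in the review above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: return 1 if every UTF-16 code unit fits in 8 bits. */
static int longname_is_narrow(const uint16_t *lfn, size_t len)
{
    size_t i;

    assert(lfn != NULL);

    for (i = 0; i < len && lfn[i] != 0; i++) {
        if (lfn[i] > 0x00FF)
            return 0;
    }
    return 1;
}

/*
 * Hypothetical: choose the name to report for a directory entry.
 * If the long name contains anything above 0x00FF, return shortname
 * untouched; otherwise narrow the long name into narrow_buf (which
 * must hold len + 1 bytes) and return narrow_buf.
 */
static const char *pick_display_name(const uint16_t *lfn, size_t len,
                                     char *narrow_buf,
                                     const char *shortname)
{
    size_t i;

    assert(narrow_buf != NULL && shortname != NULL);

    if (!longname_is_narrow(lfn, len))
        return shortname;              /* skip the long name entirely */

    for (i = 0; i < len && lfn[i] != 0; i++)
        narrow_buf[i] = (char)(unsigned char)lfn[i];
    narrow_buf[i] = '\0';
    return narrow_buf;
}
```

The caller never sees a partially translated long name, so whatever
name is returned can be typed back in and will match an entry on disk.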

-Gene

