[syslinux] COMBOOT API: Add calls for directory functions; Implement for FAT

Gene Cumm gene.cumm at gmail.com
Mon Jan 12 09:20:06 PST 2009


2009/1/10 Gene Cumm <gene.cumm at gmail.com>:
> On Fri, Dec 5, 2008 at 3:47 PM, H. Peter Anvin <hpa at zytor.com> wrote:
>> Not that it matters all that much in this case -- those filenames will
>> be inaccessible, because they won't match either way.
>>
>> There is a third option, which might be better, actually: if the
>> filename contains a character we can't make out, return the shortname
>> for that file.  The shortname will always be representable.
>>
>>        -hpa
>>
>
> Just to review (for my own clarification) the possibilities (when a
> UTF-16 character is above 0x00FF):
>

> -Skip it and go to the shortname; Always displayable; Most of the time
> typeable (My VM's BIOS won't let me do ALT+224 but can do ALT+225);
> won't mismatch; relatively simple to code (just finished it in 2
> lines).
>
> Based on this, I did the last, skip long name in favor of short name for now.
>

I decided to delve further into the depths and confusion of codepages
and UTF-16 (Windows codepages, Windows OEM codepages, etc.).  The
Unicode chart "Latin-1", representing U+0080 through U+00FF, has pretty
much no correlation to my localized codepage (OEM-437, as that appears
to be what is used in my BIOS).  Therefore, in the interest of
providing accurate information in the displayed name, I will also skip
any long name containing U+0080 or greater in favor of the short name,
as it should be more accurate (and hopefully 100% accurate).  UTF-16
below U+0080 is, by definition, plain ASCII and should be consistent
regardless of the codepage the BIOS uses.  If I am mistaken in any of
these assumptions, feedback is welcome.
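
To make the rule above concrete, here is a rough sketch in C.  It is
not the actual patch: the names longname_is_ascii() and
pick_display_name() are made up for illustration, and it assumes the
long name has already been gathered into a buffer of UTF-16 code units
and the short name is already a NUL-terminated "NAME.EXT" string.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns 1 if every UTF-16 code unit of the long
 * name (up to len units or the first NUL) is below U+0080, i.e. plain
 * ASCII that reads the same under any ASCII-compatible codepage.
 */
static int longname_is_ascii(const uint16_t *lfn, size_t len)
{
    for (size_t i = 0; i < len && lfn[i] != 0; i++) {
        if (lfn[i] >= 0x0080)
            return 0;
    }
    return 1;
}

/*
 * Hypothetical helper: narrows the long name into buf when it is pure
 * ASCII, otherwise returns the short name untouched.  A long name that
 * does not fit in buf is truncated in this sketch.
 */
static const char *pick_display_name(const uint16_t *lfn, size_t lfn_len,
                                     const char *shortname,
                                     char *buf, size_t buflen)
{
    size_t i;

    if (buflen == 0 || !longname_is_ascii(lfn, lfn_len))
        return shortname;

    for (i = 0; i < lfn_len && i + 1 < buflen && lfn[i] != 0; i++)
        buf[i] = (char)lfn[i];   /* safe: every unit is < 0x80 */
    buf[i] = '\0';
    return buf;
}
```

With this, a long name such as "README" comes back narrowed to bytes,
while one containing U+00E9 falls back to whatever short name the
directory entry carries.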

On Sat, Jan 10, 2009 at 4:34 PM, Jernej Simončič <jernej at ena.si> wrote:
> I've got a VM with Slovenian Windows 98 running at work (uses codepage
> 1250), if you're interested, I could create a floppy image with some
> of our national characters. Though I'm not sure if they'll cause any
> problems, or only show up as different characters on US (cp1252)
> version.
>

Theoretically, the short names should just show up as whatever
character my codepage is set to display for a given byte value (the
purpose of codepages, I believe), but the long names will either
display identically or those characters will probably show up as
'_' (Windows behavior for unknown characters).  I do have a sort of
intellectual/educational curiosity about what binary values the long
and short names end up with and how they display in my VM.

>
> I know that there were no problems creating files with national
> characters back in the DOS days (at least after the switch to CP852 -
> the original "standard" mapped the national characters (among others)
> to | and \, so these couldn't be used - and yes, DOS prompt in those
> days looked like C:Đ> here :)
>

I'm hoping you mean the keyboard mapping and not the codepage mapping
as I haven't seen any codepage that has that.  Were you able to input
these characters directly or did you have to do keyboard shortcuts
(including ALT+NUM)?

-Gene

