[syslinux] codepage/UnicodeData: tcase() data

H. Peter Anvin hpa at zytor.com
Sun Jan 18 12:15:56 PST 2009


Gene Cumm wrote:
> Three questions:
> 
> - Where did the file come from?
> - Does tcase() stand for toggle case (or otherwise effectively the same thing)?
> - Should uppercase characters like, the latin capital A, have tcase()
> data in addition to the lcase() data?
> 

The file comes from the Unicode Consortium (ftp.unicode.org).  The full
file is *huge* (over a megabyte), so I use the mksubset.pl script to cut
it down to only the bits we need.
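
For readers who have not looked at the file: the sketch below shows one
way such a subsetting step could work. It is not the actual mksubset.pl.
The function name and the choice to keep only the three case-mapping
columns are assumptions made for illustration. The sample records follow
the semicolon-separated UnicodeData.txt layout, where fields 12, 13 and
14 hold the simple uppercase, lowercase and titlecase mappings.

```python
# Hypothetical sketch of a subsetting step; NOT the real mksubset.pl.
# Three sample records in the UnicodeData.txt layout.
SAMPLE = """\
01C4;LATIN CAPITAL LETTER DZ WITH CARON;Lu;0;L;<compat> 0044 017D;;;;N;LATIN CAPITAL LETTER D Z HACEK;;;01C6;01C5
01C5;LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON;Lt;0;L;<compat> 0044 017E;;;;N;LATIN LETTER CAPITAL D SMALL Z HACEK;;01C4;01C6;01C5
01C6;LATIN SMALL LETTER DZ WITH CARON;Ll;0;L;<compat> 0064 017E;;;;N;LATIN SMALL LETTER D Z HACEK;;01C4;;01C5
"""


def subset_case_fields(lines, wanted):
    """Return {code point: (upper, lower, title)} for code points in `wanted`.

    `wanted` is a set of integer code points (an assumed input; a real
    tool would derive it from the codepage tables).
    """
    result = {}
    for line in lines:
        fields = line.rstrip("\n").split(";")
        if len(fields) < 15:
            continue  # skip blank or malformed records
        cp = int(fields[0], 16)
        if cp not in wanted:
            continue
        # An empty upper/lower field means the character maps to itself.
        upper = int(fields[12], 16) if fields[12] else cp
        lower = int(fields[13], 16) if fields[13] else cp
        # An empty title field means "same as the uppercase mapping".
        title = int(fields[14], 16) if fields[14] else upper
        result[cp] = (upper, lower, title)
    return result


if __name__ == "__main__":
    table = subset_case_fields(SAMPLE.splitlines(), {0x01C4, 0x01C5, 0x01C6})
    for cp, (upper, lower, title) in sorted(table.items()):
        print(f"U+{cp:04X}: upper=U+{upper:04X} lower=U+{lower:04X} title=U+{title:04X}")
```

Dropping the name, category and decomposition columns is where most of
the size saving would come from in a scheme like this.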

tcase stands for "Title Case": UPPER CASE, lower case, Title Case.  It
matters for a handful of characters like:

U+01C4 LATIN CAPITAL LETTER DZ WITH CARON
U+01C5 LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON
U+01C6 LATIN SMALL LETTER DZ WITH CARON

U+01C5 is the title case form.  I decided title case is so rare (I'm not
even sure we have *any* instances of it in any of the common codepages)
that adding it would be a waste of space.
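
To see the three forms side by side, here is a small Python sketch. It
uses Python's built-in Unicode tables, not anything from Syslinux, and
the helper name case_forms is made up for this example. It illustrates
that for the DZ digraph the title case form differs from the upper case
form, whereas for an ordinary letter the two are the same.

```python
import unicodedata

# The three DZ-with-caron code points discussed above.
DZ_UPPER, DZ_TITLE, DZ_LOWER = "\u01c4", "\u01c5", "\u01c6"


def case_forms(ch):
    """Return (upper, lower, title) for a single character.

    Illustrative helper built on Python's str methods; it is not part
    of any Syslinux code.
    """
    return ch.upper(), ch.lower(), ch.title()


if __name__ == "__main__":
    for ch in (DZ_UPPER, DZ_TITLE, DZ_LOWER, "a"):
        upper, lower, title = case_forms(ch)
        print(
            f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
            f"upper=U+{ord(upper):04X} "
            f"lower=U+{ord(lower):04X} "
            f"title=U+{ord(title):04X}"
        )
```

For the ordinary letter "a" the title and upper columns match, which is
why a table with only upper and lower case data loses nothing for most
characters.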

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



