Need advice on gotchas on upgrading unicode db to 5.1
I'm working to bring the Unicode database in Perl up to the latest
version, 5.1. Parts of it were upgraded earlier, but not all, and that
presents a problem.
In particular, the Property Value Aliases were not upgraded. These
include the short names for various property values. A number of new
scripts were defined in 5.1, such as Lydian. mktables looks up the
abbreviation for a property value and creates a file using that name; if
no abbreviation is found, it uses the full name. Because our list of
abbreviations was out of date, all the new scripts in 5.1 have been
stored under their full names instead of their accepted abbreviations.
Once the abbreviations are brought up to date, mktables generates files
named after the abbreviations instead. Same content, different name.
So, for example, it generates Lydi.pl instead of Lydian.pl.
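To make the naming rule concrete, here is a minimal sketch of it. This is
not mktables code (mktables is a Perl script); the function name and the
dictionaries are hypothetical, and only the Lydian/Lydi pair is taken
from the example above.

```python
# Illustrative sketch only; names here are hypothetical, not from mktables.
def table_file_name(full_name, aliases):
    """Use the abbreviation if one is known, else fall back to the full name."""
    short = aliases.get(full_name)
    return f"{short or full_name}.pl"

stale_aliases = {}                     # alias list that predates 5.1
current_aliases = {"Lydian": "Lydi"}   # alias list brought up to 5.1

print(table_file_name("Lydian", stale_aliases))    # Lydian.pl
print(table_file_name("Lydian", current_aliases))  # Lydi.pl
```

The table contents are the same in both cases; only the lookup result,
and therefore the file name, differs.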
Since mktables.lst was not generated, these files haven't been listed
anywhere, and since the documentation was not upgraded, they aren't
documented. But I suppose it is possible that some programmer noticed
them way down in lib/unicore/lib/gc_sc and is using them. I don't
recall these being documented as a public interface. So I want to
know: is it OK to change their names, should I create duplicate files
for these 8 files, or something else?
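For comparison, the duplicate-file option could look roughly like the
sketch below. It is an illustration under assumed names (names_to_write
and keep_legacy are hypothetical), not a proposal for how mktables
should implement it.

```python
# Illustrative sketch only; names here are hypothetical, not from mktables.
def names_to_write(full_name, aliases, keep_legacy=True):
    """List the file names a table would be written under."""
    short = aliases.get(full_name)
    names = [f"{short or full_name}.pl"]
    # Optionally keep the old full-name file so existing users don't break.
    if keep_legacy and short and short != full_name:
        names.append(f"{full_name}.pl")
    return names

aliases = {"Lydian": "Lydi"}
print(names_to_write("Lydian", aliases))                     # both names
print(names_to_write("Lydian", aliases, keep_legacy=False))  # rename only
```

The trade-off is the usual one: duplicates protect anyone relying on the
undocumented names, at the cost of extra files carrying identical content.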
Relatedly, as of 5.1 Unicode has decided to capitalize its preferred
names for decomposition types, which we store in lib/unicore/lib/dt.
This means that those file names would also change, for example from
sqr.pl to Sqr.pl. I can hack up mktables to always lowercase these, but
is it necessary?
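The lowercasing hack would amount to something like the following
sketch. Again, this is not mktables code; dt_file_name and force_lower
are hypothetical names, and only the sqr.pl/Sqr.pl pair comes from the
example above.

```python
# Illustrative sketch only; names here are hypothetical, not from mktables.
def dt_file_name(value_name, force_lower=True):
    """Build the file name for a decomposition type table."""
    stem = value_name.lower() if force_lower else value_name
    return f"{stem}.pl"

print(dt_file_name("Sqr"))                     # sqr.pl (pre-5.1 file name)
print(dt_file_name("Sqr", force_lower=False))  # Sqr.pl (5.1 capitalization)
```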
Thanks
Follow-Ups from: demerphq <demerphq@gmail.com>