[perl #59616] FOLDCHAR regop not produced for \x, \0, \N{U+....}
# New Ticket Created by karl williamson
# Please include the string: [perl #59616]
# in the subject line of all future correspondence about this issue.
# <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=59616 >
The FOLDCHAR regop is used for case-insensitive matching of 3
characters, including LATIN SMALL LETTER SHARP S. The problem is that
it is not generated when those characters are written in the pattern
using a hex escape, an octal escape, or the \N{U+...} form of a named
character.
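For reference, the spellings in question all denote the same character, U+00DF, so any difference in matching comes from how the pattern is compiled rather than from the character itself. A minimal sketch that confirms this at the string level (the hash and its keys are illustrative names of mine, not anything from the core source):

```perl
use strict;
use warnings;
use charnames ':full';

# Four escape spellings of LATIN SMALL LETTER SHARP S (U+00DF).
# The keys are illustrative labels only.
my %spelling = (
    hex   => "\xdf",
    octal => "\337",
    uplus => "\N{U+00DF}",
    name  => "\N{LATIN SMALL LETTER SHARP S}",
);

# Print the code point each spelling produces.
for my $how (sort keys %spelling) {
    printf "%-5s U+%04X\n", $how, ord($spelling{$how});
}
```

Each spelling should report the same code point; only the internal storage of the resulting string may differ.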
From reading the list archive about this, it appears that people
thought about the hex form, but from my reading of the code and the
following experiment, it doesn't work.
use charnames ':full';
use utf8;
print __LINE__, " ", ("ss" =~ /ß/i ? "yes" : "no"), "\n";
print __LINE__, " ", ("ss" =~ /\xdf/i ? "yes" : "no"), "\n";
print __LINE__, " ", ("\x100ss" =~ /\x100ß/i ? "yes" : "no"), "\n";
print __LINE__, " ", ("\x100ss" =~ /\x100\xdf/i ? "yes" : "no"), "\n";
print __LINE__, " ", ("\x100ss" =~ /\x100\337/i ? "yes" : "no"), "\n";
print __LINE__, " ", ("\x100ss" =~ /\x100\N{U+00DF}/i ? "yes" : "no"), "\n";
print __LINE__, " ", ("\x100ss" =~ /\x100\N{LATIN SMALL LETTER SHARP S}/i ? "yes" : "no"), "\n";
yields on my computer:
3 yes
4 no
5 yes
6 no
7 no
8 no
9 yes
The "no" on line 4 is expected under the current rule that something
must be stored as utf8 in order to enable full Unicode semantics. But
the "no"s on lines 6-8 shouldn't be there, because the \x100 means that
the pattern should be stored as utf8. I haven't analyzed why using the
full character name works when the U+ form does not.
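Whether a given escape actually forces utf8 storage can be checked directly. A minimal sketch, assuming only core Perl (the variable names are illustrative): it compares the unbraced \x100 used in the experiment with the braced \x{100}. As far as I know, an unbraced \x takes at most two hex digits, so the two may not be the same string, which would be worth ruling out before drawing conclusions from lines 6-8.

```perl
use strict;
use warnings;

# Without braces, \x is documented to take at most two hex digits.
my $unbraced = "\x100";     # expected: chr(0x10) followed by the digit "0"
my $braced   = "\x{100}";   # a single character, U+0100

# utf8::is_utf8 is provided by the core; no "use utf8" is needed to call it.
for my $pair ([unbraced => $unbraced], [braced => $braced]) {
    my ($label, $str) = @$pair;
    printf "%-8s length=%d first=U+%04X stored_as_utf8=%s\n",
        $label, length($str), ord($str),
        (utf8::is_utf8($str) ? "yes" : "no");
}
```

If the unbraced form does turn out to be two characters below 256, the experiment would need \x{100} in both the target string and the pattern to be sure the pattern is stored as utf8.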
I'm using
Summary of my perl5 (revision 5 version 11 subversion 0 patch 34418) configuration:
  Platform:
    osname=linux, osvers=2.6.24-19-generic, archname=i686-linux
    uname='linux karl 2.6.24-19-generic #1 smp wed aug 20 22:56:21 utc 2008 i686 gnulinux '
    config_args=''
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=undef, use64bitall=undef, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O2',
    cppflags='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.2.3 (Ubuntu 4.2.3-2ubuntu7)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lnsl -ldl -lm -lcrypt -lutil -lc
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
    libc=/lib/libc-2.7.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.7'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector'
Characteristics of this binary (from libperl):
  Compile-time options: PERL_DONT_CREATE_GVSV PERL_MALLOC_WRAP USE_LARGE_FILES USE_PERLIO
  Locally applied patches:
    DEVEL
  Built under linux
  Compiled at Sep 25 2008 12:31:18
  @INC:
    /home/khw/perl5.11/lib/5.11.0/i686-linux
    /home/khw/perl5.11/lib/5.11.0
    /home/khw/localperl/lib/site_perl/5.11.0/i686-linux
    /home/khw/localperl/lib/site_perl/5.11.0
    /home/khw/localperl/lib/site_perl
    .