[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index][Thread Index][Top&Search][Original]
odd behaviour of substr(lc($utf8)) in 5.8.8
Hi folks,
I don't know if this is a bug but it sure is odd - I'm more of a perl
user than core developer and I haven't found anyone or any documents
that can explain this one.
If it's been fixed recently please point me towards the change logs!
(see below for perl -V output)
--------------------------------------------------Test case:
#!/usr/bin/perl
use utf8; # so we can add UTF-8 chars below
use strict;
use warnings;
my @words=split(' ','Some £long £piece of£ text containing UTF-8, A
Really£ Long£ £Line £of órandom nicé and fruití words which have VERY
LITTLE MEANING and are ALL JUMBLED UP by the code, this should be
interesting to watch');
for(my $i=0;$i<10;$i++){
# generate a random string from the words above:
my $long_string='';
for(my $n=0;$n<6;$n++){
$long_string.=$words[rand(scalar(@words))].' ';
}
my $short1=substr(lc($long_string),0,256);
my $short2=lc($long_string);
$short2=substr($short2,0,256); # in theory should be == $short1
if($short1 ne $short2){
print "DIFFERENT at iteration
$i:\n\t\tshort1=$short1\n\t\tshort2=$short2\n";
}
}
print "\n";
---------------------------------------------------------------------------
Try running it a few times - on various machines I've tried, this will
show $short1 being randomly truncated - sometimes not at all in the 10
iterations - on no particular boundary, and $short2 exactly as
expected. It's specific to the construct substr( lc( $utf8_text ), n, n ).
----------------------------------------------------------------------------
Perl -V output:
Summary of my perl5 (revision 5 version 8 subversion 8) configuration:
Platform:
osname=linux, osvers=2.6.15.7, archname=x86_64-linux-gnu-thread-multi
uname='linux king 2.6.15.7 #1 smp sun sep 23 13:51:52 utc 2007
x86_64 gnulinux '
config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN
-Dcccdlflags=-fPIC -Darchname=x86_64-linux-gnu -Dprefix=/usr
-Dprivlib=/usr/share/perl/5.8 -Darchlib=/usr/lib/perl/5.8
-Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5
-Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local
-Dsitelib=/usr/local/share/perl/5.8.8
-Dsitearch=/usr/local/lib/perl/5.8.8 -Dman1dir=/usr/share/man/man1
-Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1
-Dsiteman3dir=/usr/local/man/man3 -Dman1ext=1 -Dman3ext=3perl
-Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Uusesfio -Uusenm
-Duseshrplib -Dlibperl=libperl.so.5.8.8 -Dd_dosuid -des'
hint=recommended, useposix=true, d_sigaction=define
usethreads=define use5005threads=undef useithreads=define
usemultiplicity=define
useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
use64bitint=define use64bitall=define uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS
-DDEBIAN -fno-strict-aliasing -pipe -I/usr/local/include
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
optimize='-O2',
cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBIAN
-fno-strict-aliasing -pipe -I/usr/local/include'
ccversion='', gccversion='4.1.2 (Ubuntu 4.1.2-0ubuntu4)',
gccosandvers=''
intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t',
lseeksize=8
alignbytes=8, prototype=define
Linker and Libraries:
ld='cc', ldflags =' -L/usr/local/lib'
libpth=/usr/local/lib /lib /usr/lib
libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt
perllibs=-ldl -lm -lpthread -lc -lcrypt
libc=/lib/libc-2.5.so, so=so, useshrplib=true, libperl=libperl.so.5.8.8
gnulibc_version='2.5'
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'
Characteristics of this binary (from libperl):
Compile-time options: MULTIPLICITY PERL_IMPLICIT_CONTEXT
PERL_MALLOC_WRAP THREADS_HAVE_PIDS USE_64_BIT_ALL
USE_64_BIT_INT USE_ITHREADS USE_LARGE_FILES
USE_PERLIO USE_REENTRANT_API
Built under linux
Compiled at Dec 4 2007 09:01:45
@INC:
/etc/perl
/usr/local/lib/perl/5.8.8
/usr/local/share/perl/5.8.8
/usr/lib/perl5
/usr/share/perl5
/usr/lib/perl/5.8
/usr/share/perl/5.8
/usr/local/lib/site_perl
---------------------------------------------------------------------
any advice appreciated,
John
- Follow-Ups from:
-
Nicholas Clark <nick@ccl4.org>
[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index][Thread Index][Top&Search][Original]