
[perl #59168] 5.8.x: regexp bug with (?!)



# New Ticket Created by  Father Chrysostomos 
# Please include the string:  [perl #59168]
# in the subject line of all future correspondence about this issue. 
# <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=59168 >


perl 5.8.9-to-be (34300) has an elusive regexp bug involving
backtracking, negative look-aheads and back-references. Compare the
output of this program under 5.8.8 (5.10 and blead [34308] behave the
same) with its output under maint:

use Data::Dumper;

"baaabaac" =~ /(.*?)a(?!(a+)b\2(?{print "$-[0] $2\n"})c)/;

print Dumper [$&,$1];
print Dumper [@-[0,1]];

$ perl5.8.8 prog
0 aa
0 a
$VAR1 = [
           'baa',
           'ba',
         ];
$VAR1 = [
           '0',
           '0',
         ];

$ /opt/maint/bin/perl prog
0 aa
1 aa
$VAR1 = [
           'a',
           '',
         ];
$VAR1 = [
           '2',
           '2',
         ];
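
The difference between the two runs can also be checked mechanically. Here is a minimal sketch of such a check, assuming the (?{...}) tracing block can be dropped without changing the result (the debug trace further down was made without it). The sub name match_report_case is invented for this example and is not part of any test suite.

```perl
use strict;
use warnings;

# Hypothetical helper: runs the pattern from the report (minus the
# (?{...}) tracing block) and returns [$&, $1, $-[0], $-[1]],
# or undef if there is no match at all.
sub match_report_case {
    my ($subject) = @_;
    return unless $subject =~ /(.*?)a(?!(a+)b\2c)/;
    return [ $&, $1, $-[0], $-[1] ];
}

my $got = match_report_case('baaabaac');

# Print the four values in the same order as the two Dumper calls.
print join(',', map { "'$_'" } @$got), "\n";
```

A perl that behaves like the 5.8.8 run above should print 'baa','ba','0','0'; the maint run above corresponds to 'a','','2','2'.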

It appears that, after the first attempt at matching the contents of
(?!...) succeeds (causing the look-ahead to fail), maint doesn’t
backtrack and try longer matches for the (.*?). Instead it gives up on
offset 0 altogether and retries the entire regexp from offset 1.
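
To make the expected backtracking concrete, here is a rough sketch that walks through the candidate lengths of (.*?) at offset 0 by hand, shortest first. It only imitates what the engine should do for this one input; the variable names and the trace wording are made up for illustration and say nothing about how regexec.c is organised.

```perl
use strict;
use warnings;

my $subject = 'baaabaac';
my @trace;

# Try each length for the lazy group at start offset 0, shortest first.
for my $len (0 .. length($subject) - 1) {
    my $rest = substr($subject, $len);

    # The literal 'a' must follow the lazy group.
    if ($rest !~ /\Aa/) {
        push @trace, "len=$len: no 'a' follows";
        next;
    }

    # The look-ahead body is tested right after that 'a'.
    if (substr($rest, 1) =~ /\A(a+)b\1c/) {
        push @trace, "len=$len: look-ahead body matches, so (?!...) rejects";
        next;
    }

    push @trace, "len=$len: accepted, \$1='" . substr($subject, 0, $len) . "'";
    last;
}

print "$_\n" for @trace;
```

The walk rejects lengths 0 and 1 and accepts length 2, which is the 'ba' that 5.8.8 reports for $1. The maint trace stops after the length-1 rejection instead of going on to length 2.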

The output from -Mre=debug and perlbug -d follows:


Compiling REx `(.*?)a(?!(a+)b\2c)'
size 27 Got 220 bytes for offset annotations.
first at 4
synthetic stclass "ANYOF[\0-\11\13-\377{unicode_all}]".
    1: OPEN1(3)
    3:   MINMOD(4)
    4:   STAR(6)
    5:     REG_ANY(0)
    6: CLOSE1(8)
    8: EXACT <a>(10)
   10: UNLESSM[-0](27)
   12:   OPEN2(14)
   14:     PLUS(17)
   15:       EXACT <a>(0)
   17:   CLOSE2(19)
   19:   EXACT <b>(21)
   21:   REF2(23)
   23:   EXACT <c>(25)
   25:   SUCCEED(0)
   26:   TAIL(27)
   27: END(0)
floating "a" at 0..2147483647 (checking floating) stclass "ANYOF[\0-\11\13-\377{unicode_all}]" minlen 1
Offsets: [27]
	1[1] 0[0] 4[1] 3[1] 2[1] 5[1] 0[0] 6[1] 0[0] 10[8] 0[0] 10[1] 0[0] 12[1] 11[1] 0[0] 13[1] 0[0] 14[1] 0[0] 15[2] 0[0] 17[1] 0[0] 17[0] 17[0] 19[0]
Guessing start of match, REx "(.*?)a(?!(a+)b\2c)" against "baaabaac"...
Found floating substr "a" at offset 1...
Does not contradict STCLASS...
Guessed: match at offset 0
Matching REx "(.*?)a(?!(a+)b\2c)" against "baaabaac"
Matching stclass "ANYOF[\0-\11\13-\377{unicode_all}]" against "baaabaac"
   Setting an EVAL scope, savestack=3
    0 <> <baaabaac>        |  1:  OPEN1
    0 <> <baaabaac>        |  3:  MINMOD
    0 <> <baaabaac>        |  4:  STAR
   Setting an EVAL scope, savestack=3
                            REG_ANY can match 1 times out of 1...
    1 <b> <aaabaac>        |  6:    CLOSE1
    1 <b> <aaabaac>        |  8:    EXACT <a>
    2 <ba> <aabaac>        | 10:    UNLESSM[-0]
    2 <ba> <aabaac>        | 12:      OPEN2
    2 <ba> <aabaac>        | 14:      PLUS
                            EXACT <a> can match 2 times out of 2147483647...
   Setting an EVAL scope, savestack=3
    4 <baaa> <baac>        | 17:        CLOSE2
    4 <baaa> <baac>        | 19:        EXACT <b>
    5 <baaab> <aac>        | 21:        REF2
    7 <baaabaa> <c>        | 23:        EXACT <c>
    8 <baaabaac> <>        | 25:        SUCCEED
                                   could match...
                               failed...
                            REG_ANY can match 0 times out of 1...
                             failed...
   Setting an EVAL scope, savestack=3
    1 <b> <aaabaac>        |  1:  OPEN1
    1 <b> <aaabaac>        |  3:  MINMOD
    1 <b> <aaabaac>        |  4:  STAR
   Setting an EVAL scope, savestack=3
    1 <b> <aaabaac>        |  6:    CLOSE1
    1 <b> <aaabaac>        |  8:    EXACT <a>
    2 <ba> <aabaac>        | 10:    UNLESSM[-0]
    2 <ba> <aabaac>        | 12:      OPEN2
    2 <ba> <aabaac>        | 14:      PLUS
                            EXACT <a> can match 2 times out of 2147483647...
   Setting an EVAL scope, savestack=3
    4 <baaa> <baac>        | 17:        CLOSE2
    4 <baaa> <baac>        | 19:        EXACT <b>
    5 <baaab> <aac>        | 21:        REF2
    7 <baaabaa> <c>        | 23:        EXACT <c>
    8 <baaabaac> <>        | 25:        SUCCEED
                                   could match...
                               failed...
                            REG_ANY can match 0 times out of 1...
                             failed...
   Setting an EVAL scope, savestack=3
    2 <ba> <aabaac>        |  1:  OPEN1
    2 <ba> <aabaac>        |  3:  MINMOD
    2 <ba> <aabaac>        |  4:  STAR
   Setting an EVAL scope, savestack=3
    2 <ba> <aabaac>        |  6:    CLOSE1
    2 <ba> <aabaac>        |  8:    EXACT <a>
    3 <baa> <abaac>        | 10:    UNLESSM[-0]
    3 <baa> <abaac>        | 12:      OPEN2
    3 <baa> <abaac>        | 14:      PLUS
                            EXACT <a> can match 1 times out of 2147483647...
   Setting an EVAL scope, savestack=3
    4 <baaa> <baac>        | 17:        CLOSE2
    4 <baaa> <baac>        | 19:        EXACT <b>
    5 <baaab> <aac>        | 21:        REF2
    6 <baaaba> <ac>        | 23:        EXACT <c>
                                   failed...
                                 failed...
    3 <baa> <abaac>        | 27:    END
Match successful!
Freeing REx: `"(.*?)a(?!(a+)b\\2c)"'


---
Flags:
     category=core
     severity=high
---
Site configuration information for perl v5.8.8:

Configured by sprout at Wed Sep 17 21:26:05 PDT 2008.

Summary of my perl5 (revision 5 version 8 subversion 8 patch 34376) configuration:
   Platform:
     osname=darwin, osvers=9.4.0, archname=darwin-2level
     uname='darwin pint.local 9.4.0 darwin kernel version 9.4.0: mon jun 9 19:30:53 pdt 2008; root:xnu-1228.5.20~1release_i386 i386 '
     config_args='-de -Dprefix=/opt/maint/'
     hint=recommended, useposix=true, d_sigaction=define
     usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
     useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
     use64bitint=undef use64bitall=undef uselongdouble=undef
     usemymalloc=n, bincompat5005=undef
   Compiler:
     cc='cc', ccflags ='-fno-common -DPERL_DARWIN -no-cpp-precomp -fno-strict-aliasing -pipe -I/usr/local/include',
     optimize='-O3',
     cppflags='-no-cpp-precomp -fno-common -DPERL_DARWIN -no-cpp-precomp -fno-strict-aliasing -pipe -I/usr/local/include'
     ccversion='', gccversion='4.0.1 (Apple Inc. build 5484)', gccosandvers=''
     intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
     d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
     ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
     alignbytes=8, prototype=define
   Linker and Libraries:
     ld='env MACOSX_DEPLOYMENT_TARGET=10.3 cc', ldflags =' -L/usr/local/lib'
     libpth=/usr/local/lib /usr/lib
     libs=-ldbm -ldl -lm -lutil -lc
     perllibs=-ldl -lm -lutil -lc
     libc=/usr/lib/libc.dylib, so=dylib, useshrplib=false, libperl=libperl.a
     gnulibc_version=''
   Dynamic Linking:
     dlsrc=dl_dlopen.xs, dlext=bundle, d_dlsymun=undef, ccdlflags=' '
     cccdlflags=' ', lddlflags=' -bundle -undefined dynamic_lookup -L/usr/local/lib'

Locally applied patches:
     MAINT34300

---
@INC for perl v5.8.8:
     /opt/maint/lib/perl5/5.8.8/darwin-2level
     /opt/maint/lib/perl5/5.8.8
     /opt/maint/lib/perl5/site_perl/5.8.8/darwin-2level
     /opt/maint/lib/perl5/site_perl/5.8.8
     .

---
Environment for perl v5.8.8:
     DYLD_LIBRARY_PATH (unset)
     HOME=/Users/sprout
     LANG=en_US.UTF-8
     LANGUAGE (unset)
     LD_LIBRARY_PATH (unset)
     LOGDIR (unset)
     PATH=/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/usr/local/bin
     PERL_BADLANG (unset)
     SHELL=/bin/bash


Follow-Ups from:
Nicholas Clark <nick@ccl4.org>
