
use bytes pragma



Glenn Linderman wrote:
> On approximately 10/7/2008 7:05 AM, came the following characters from 
> the keyboard of David Nicol:
>> On Mon, Oct 6, 2008 at 11:04 PM, Glenn Linderman <perl@nevcal.com> wrote:
>>>  \w is meaningless on binary data, for example, although character
>>> classes (could be called byte classes) could still be useful without
>>> character semantics.
>>
>> Let's say one is faced with a legacy delimited file that uses 0xFF for
>> a separator.  Running
>>
>>    use bytes;
>>    @strings = $data =~ /(\w+)/g;
>>
>> could be handy.
> 
> 
> I guess your legacy delimited file is intended to be ASCII text, with 
> each string delimited by 0xFF, but that is only a guess, since you 
> didn't make it clear.
> 
> @strings = split ( /\xFF/, $data )
> 
> would do the same job, be independent of "use bytes;", and allow for 
> punctuation and control characters in the @strings.  You didn't state 
> that the @strings should contain only alphanumerics, but your code does.
> Of course, even if the @strings are supposed to only contain 
> alphanumerics, your code would treat punctuation and control characters 
> as additional delimiters and not only ignore the error case, but make it 
> impossible to detect without reexamining $data.  My code would treat 
> only \xFF as delimiters (per your specification), and then additional 
> code could be written to check the resulting @strings for validity as 
> appropriate.
> 
> You'll need to contrive a more useful and more completely specified 
> example to be convincing.
> 
> 
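To make the difference between the two quoted approaches concrete, here is a minimal sketch. The sample data is hypothetical (the thread never shows a real file), and the second field contains a hyphen on purpose. It assumes a Perl without locale or unicode_strings in effect, so \w under "use bytes" matches only ASCII word characters.

```perl
use strict;
use warnings;

# Hypothetical sample data: three fields separated by the byte 0xFF.
# The middle field deliberately contains punctuation.
my $data = "alpha\xFFbe-ta\xFFgamma";

# Approach from the quoted suggestion: collect runs of word characters.
# "use bytes" is lexically scoped, so it only affects this do block.
my @by_word = do { use bytes; $data =~ /(\w+)/g };

# Approach from the reply: treat only 0xFF as a delimiter.
my @by_split = split /\xFF/, $data;

# Validity can then be checked separately on the split result.
my @invalid = grep { !/\A\w+\z/ } @by_split;

print join('|', @by_word),  "\n";   # the hyphen acts as an extra delimiter
print join('|', @by_split), "\n";   # the hyphen stays inside its field
print "invalid: @invalid\n";
```

With split, the field containing punctuation survives intact and can be flagged. With the \w+ match, it is silently broken in two.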

I don't want to get embroiled in changes to this pragma at this time.
I notice that perluniintro says "...the "bytes" pragma and its only
defined function "length()"..."  If I understand this correctly, it 
supports Glenn's position as to the original intent of this pragma, but 
it is also wrong, as the pragma defines a number of other functions, 
such as substr().
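
As an illustration of that last point, here is a small sketch that calls both bytes::length() and bytes::substr(). It assumes a Perl whose bytes module provides bytes::substr(), and the sample string is hypothetical, chosen so that it is stored internally as UTF-8.

```perl
use strict;
use warnings;
use bytes ();   # load the module without enabling byte semantics here

# Hypothetical sample: U+00E9 followed by U+263A. The second character
# forces the string to be stored internally as UTF-8.
my $str = "\x{e9}\x{263A}";

my $chars = length $str;                 # counts characters
my $bytes = bytes::length($str);         # counts bytes of the internal encoding
my $first = bytes::substr($str, 0, 2);   # first two bytes of that encoding

printf "chars=%d bytes=%d first=%vX\n", $chars, $bytes, $first;
```

Both functions look at the internal byte representation, which is why the byte count differs from the character count.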

The code I have looked at is permeated with hooks to make this pragma 
work as if one were in the C locale.

So I think it is unwise to change it at this time.  I wonder how often 
this pragma is used in practice.


References to:
karl williamson <public@khwilliamson.com>
Glenn Linderman <perl@NevCal.com>
"David Nicol" <davidnicol@gmail.com>
Glenn Linderman <perl@NevCal.com>
