Reputation: 20300
I want to split a string using regular expressions, but I have run into a problem. I have this string:
$text=" one two three";
Then I try to split it into alphabetic words:
#@words = split(" ", $text);        #1 this works
@words = split("[^a-zA-Z]", $text); #2 this doesn't work
for $word (@words) {
    # print rather than printf: the string is not a format template
    print "word: |$word|\n";
}
So the commented-out method (1) works fine. As expected, it prints:
word: |one|
word: |two|
word: |three|
However, with the second method (2) I get this:
word: ||
word: |one|
word: |two|
word: |three|
So although logically the second method should be equivalent to the first one, in practice it doesn't behave the same way. Why is that?
Upvotes: 2
Views: 196
Reputation: 336108
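This is a special case in Perl's split() function: a pattern consisting of a single space skips leading whitespace, whereas your regex matches the leading space and therefore produces an empty first field.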
As stated in perldoc:
split(/PATTERN/, expr, [limit])
If PATTERN is omitted, [it] splits on whitespace (after skipping any leading whitespace).
Empty leading fields are produced when there are positive-width matches at the beginning of the string; [...]
As a special case, specifying a PATTERN of space (' ') will split on white space just as split with no arguments does. Thus, split(' ') can be used to emulate awk's default behavior, whereas split(/ /) will give you as many initial null fields (empty string) as there are leading spaces.
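To make the difference concrete, here is a small sketch using the string from the question. The two workarounds at the end (matching words with a global regex, and filtering empty fields with grep) are my own suggestions rather than anything from perldoc, and the variable names are just illustrative.

```perl
use strict;
use warnings;

my $text = " one two three";

# Special case: a single-space string skips leading whitespace first.
my @awk_style = split(' ', $text);

# An ordinary regex matches the leading space, so the first field is empty.
my @regex_style = split(/[^a-zA-Z]+/, $text);

# Possible workaround 1 (assumption, not from perldoc):
# collect the words directly instead of splitting on the separators.
my @matched = $text =~ /[a-zA-Z]+/g;

# Possible workaround 2 (assumption): drop empty fields after splitting.
my @filtered = grep { length } split(/[^a-zA-Z]+/, $text);

print "awk_style:   |", join('|', @awk_style),   "|\n";
print "regex_style: |", join('|', @regex_style), "|\n";
print "matched:     |", join('|', @matched),     "|\n";
print "filtered:    |", join('|', @filtered),    "|\n";
```

Only the regex_style list should start with an empty field; the other three contain just the three words. Matching with /[a-zA-Z]+/g is probably the closer fit when the goal is "give me the alphabetic words", since it never generates separator-induced empty fields in the first place.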
Upvotes: 10