GrSrv

Reputation: 559

Perl Regex issue

I was trying a random regex:

$string = "sajdk3:jdk3:jdk3:dgklmhij";
@arr = split(/([\da-z]+)([:;])\1\2\1/, $string);

# As I understand regexes, the pattern should match jdk3:jdk3:jdk3,
# so @arr should contain two scalar values: 'sa' and ':dgklmhij'.
# But when I printed @arr, I got something else:
print "Array: @arr\nNumber of items: ", scalar @arr;
#Array: sa jdk3 : :dgklmhij
#Number of items: 4

# So, I tried: 
$string =~ /([\da-z]+)([:;])\1\2\1/;
print "\n( $1 ) ( $2 )\n";
print "($`)($&)($') \n";
# ( jdk3 ) ( : )
# (sa)(jdk3:jdk3:jdk3)(:dgklmhij) 

Can someone explain why the array has 4 elements, instead of 2?

Okay, so after @mpapec's explanation, I'm curious how to get just the two fields. What should one do when a capturing group in the split pattern is unavoidable? For example, when splitting a date that can be written as 12-05-92, 26.11.87, or 07 04 92.

Upvotes: 3

Views: 235

Answers (1)

mpapec

Reputation: 50637

Can someone explain why the array has 4 elements, instead of 2?

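Your pattern contains two capturing groups, and split returns whatever they capture along with the fields. So instead of 2 elements you get 4: 'sa', the captured 'jdk3' and ':', and ':dgklmhij'.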
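A small sketch of the effect, using the date example from the question. The variable names here are only illustrative; the point is that the same separator pattern returns extra fields once it is wrapped in a capturing group.

```perl
use strict;
use warnings;

my $date = "12-05-92";

# Capturing group: each matched separator comes back as an extra field.
my @with_caps = split /([-. ])/, $date;
print scalar(@with_caps), " fields: @with_caps\n";

# Non-capturing group (a bare character class works too): separators are dropped.
my @without_caps = split /(?:[-. ])/, $date;
print scalar(@without_caps), " fields: @without_caps\n";
```

So when the grouping is only there for alternation or quantifying, switching to (?:...) is usually enough.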

From http://perldoc.perl.org/functions/split.html:

If the PATTERN contains capturing groups, then for each separator, an additional field is produced for each substring captured by a group...
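When the capturing groups cannot be removed because the pattern needs backreferences, one possible workaround (a sketch, not the only way) is to discard the captured fields afterwards. With N groups, each separator adds N fields, so the real fields sit at every (N+1)th index. For the date case, a plain match with a backreference may be simpler than split. The names $groups, @all and @fields are just illustrative.

```perl
use strict;
use warnings;

my $string = "sajdk3:jdk3:jdk3:dgklmhij";

# Two capture groups => every separator contributes two extra fields,
# so the real fields are at indices 0, 3, 6, ...
my $groups = 2;
my @all    = split /([\da-z]+)([:;])\1\2\1/, $string;
my @fields = @all[ grep { $_ % ($groups + 1) == 0 } 0 .. $#all ];
print "Fields: @fields\n";    # Fields: sa :dgklmhij

# Date case: match instead of split, requiring the same separator twice.
for my $date ("12-05-92", "26.11.87", "07 04 92") {
    if (my ($d, $sep, $m, $y) = $date =~ /^(\d\d)([-. ])(\d\d)\2(\d\d)$/) {
        print "$d/$m/$y\n";
    }
}
```

The index filter relies on split emitting a field (possibly undef) for every group at every separator, which is what the documentation quoted above describes.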

Upvotes: 3
