Krab

Reputation: 6756

Perl - positions of regex match in string

if (my @matches = $input_string =~ /$metadata[$_]{"pattern"}/g) {
  print $-[1] . "\n"; # this gives me error uninitialized ...
}

print scalar @matches; gives me 4, which is correct, but when I use $-[1] to get the start of the first match, I get an error. What is the problem?

EDIT1: How can I get the position of each match in the string? If I have the string "ahoj ahoj ahoj" and the regexp /ahoj/g, how can I get the start and end positions of each "ahoj" in the string?

Upvotes: 3

Views: 1644

Answers (2)

user3408541

Reputation: 63

To find the position of every match in a string, match with the /g modifier in the condition of a while loop, so that each iteration handles one match. Inside the loop, $-[0] (the first element of the @- array) holds the start offset of the current match. The builtin function pos returns the offset at which the most recent match on the string ended; call it as

pos($string)

The code would look something like this.

#!/usr/bin/perl
use strict;
use warnings;

my @matchedStrings;            #array to keep all matched strings
my @matchedStartingPositions;  #array to keep all starting positions
my @matchedEndingPositions;    #array to keep all ending positions

my $string = "ahoj ahoj ahoj";
print "Search string is: \"$string\" \n\n";

while ($string =~ /(ahoj)/g) {
  if (defined $1) { #if the capture group matched (a plain truth test would skip "0" or "")
    my $stringMatch = $1;
    my $startingPositionOfMatch = $-[0];       #start offset of the whole match, from the builtin @- array
    my $endingPositionOfMatch = pos($string);  #builtin function: offset where the last /g match ended
    print "Match \"$stringMatch\" found starting at position: $startingPositionOfMatch, ending at position: $endingPositionOfMatch\n";
    
    #store data in array for later reference
    push @matchedStrings, $stringMatch;
    push @matchedStartingPositions, $startingPositionOfMatch;
    push @matchedEndingPositions, $endingPositionOfMatch;
  }
}
print "\n";

#the three arrays should all be equal in length
print "Stored data on all matches\n";
print "Match:\t" . join("\t",@matchedStrings) . "\n";
print "SPos:\t" . join("\t",@matchedStartingPositions) . "\n";
print "EPos:\t" . join("\t",@matchedEndingPositions) . "\n";

Output looks like this

$ perl matches.pl
Search string is: "ahoj ahoj ahoj" 

Match "ahoj" found starting at position: 0, ending at position: 4
Match "ahoj" found starting at position: 5, ending at position: 9
Match "ahoj" found starting at position: 10, ending at position: 14

Stored data on all matches
Match:  ahoj    ahoj    ahoj
SPos:   0       5       10
EPos:   4       9       14

Upvotes: 0

Borodin

Reputation: 126722

The array @- contains the offset of the start of the last successful match (in $-[0]) and the offset of any captures there may have been in that match (in $-[1], $-[2] etc.).

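There are no captures in your pattern, so only $-[0] is valid. Because the match is evaluated in list context with /g, every match has already been found by the time you read @-, and (in your case) the last successful match is the fourth one, so $-[0] will contain the offset of the fourth instance of the pattern.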
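The behaviour described above can be seen in a small sketch. It assumes the "ahoj ahoj ahoj" string from the question (three matches, not the asker's four), and the variable names are only illustrative. It also uses @+, the builtin companion of @- that holds end offsets.

```perl
use strict;
use warnings;

my $input = "ahoj ahoj ahoj";
my ($count, $last_start, $group_start);

# List context with /g finds every match before the if body runs,
# so @- describes only the last successful match at this point.
if (my @all = $input =~ /ahoj/g) {
    $count       = @all;     # 3 matches
    $last_start  = $-[0];    # 10, the start of the third "ahoj"
    $group_start = $-[1];    # undef: the pattern has no capture group
}
print "matches: $count, last start: $last_start\n";
print "\$-[1] is ", (defined $group_start ? "defined" : "undefined"), "\n";

# With one capture group, $-[1] and $+[1] give that group's start and end.
my @group_span;
if ($input =~ /a(ho)j/) {
    @group_span = ($-[1], $+[1]);    # (1, 3)
}
print "group 1 spans @group_span\n";
```

Printing $-[1] directly in the first case is what triggers the "uninitialized" warning from the question.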

The way to get the offsets of individual matches is to write

my @matches;
while ("ahoj ahoj ahoj" =~ /(ahoj)/g) {
  push @matches, $1;
  print $-[0], "\n";
}

output

0
5
10

Or if you don't want the individual matched strings, then

my @matches;
push @matches, $-[0] while "ahoj ahoj ahoj" =~ /ahoj/g;
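Since the question also asks for the end of each match, here is a minimal sketch that records both offsets. It relies on the builtin @+ array, whose element $+[0] is the offset just past the end of the current match; the names $string and @spans are only illustrative.

```perl
use strict;
use warnings;

my $string = "ahoj ahoj ahoj";
my @spans;    # one [start, end] pair per match

# Scalar-context /g advances one match per iteration, so @- and @+
# describe the current match inside the loop body.
while ($string =~ /ahoj/g) {
    push @spans, [ $-[0], $+[0] ];
}

# Prints 0-4, 5-9 and 10-14, one pair per line.
printf "%d-%d\n", @$_ for @spans;
```

The end offset is exclusive, so substr($string, $start, $end - $start) returns the matched text.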

Upvotes: 9
