scott

How do I find the index location of a substring matched with a regex in Perl?

I am iterating through a file and on each line I am looking for a regex. If the regex is found I just want to print "it's found" and then the index location of where it was found in that line.

Example:

looking for: 'HDWFLSFKD' need index between two Ds
line: MLTSHQKKF*HDWFLSFKD*SNNYNSKQNHSIKDIFNRFNHYIYNDLGIRTIA
output: 'its found' index location: 10-17

The 'looking for' pattern above is quite simple, but I am planning to use complex regexes there.
So basically I just want to know: if a regex matches a string, how do I get the index location of the match?

Here is the code I have so far:

foreach my $line (@file_data)
{
        if ($line=~ /HDWFLSFKD/){
            print "it's found\n"; 
            print "but at what index are the two Ds";
          }   
        else {
            $sequence.=$line;
            print "came in else\n";
        }
}

Upvotes: 4

Views: 6081

Answers (2)

Chas. Owens

Reputation: 64909

I believe you are looking for pos:

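#!/usr/bin/perl

use strict;
use warnings;

my $sequence;
while (my $line = <DATA>) {
    # /g in scalar context records where the match ended in pos($line)
    if ($line =~ /(HDWFLSFKD)/g) {
        # start offset = end offset minus the length of the matched text
        print "its found index location: ",
            pos($line) - length($1), "-", pos($line), "\n";
    } else {
        $sequence .= $line;
        print "came in else\n";
    }
}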

__DATA__
MLTSHQKKF*HDWFLSFKD*SNNYNSKQNHSIKDIFNRFNHYIYNDLGIRTIA
MLTSHQKKFSNNYNSKQNHSIKDIFNRFNHYIYNDLGIRTIA
MLTSHQKKFSNNYNSK*HDWFLSFKD*QNHSIKDIFNRFNHYIYNDLGIRTIA
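
If a line can contain the motif more than once, the same pos idea extends to a loop. This is only a sketch: match_spans is a hypothetical helper name, and the sample line is made up from the question's data. It wraps the pattern in an outer capture group so that length($1) is always the length of the whole match, even when the pattern has groups of its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# match_spans is a hypothetical helper (not from the answer above).
# It returns one [start, end] pair per match of $re in $line, where
# "end" is the offset just past the last matched character.
sub match_spans {
    my ($line, $re) = @_;
    my @spans;
    # Each pass of a scalar-context /g match resumes at pos($line).
    while ($line =~ /($re)/g) {
        my $end = pos($line);
        push @spans, [ $end - length($1), $end ];
    }
    return @spans;
}

# Made-up sample line containing the motif twice.
my $line = 'MLTSHQKKF*HDWFLSFKD*SNNYNSK*HDWFLSFKD*QNH';
printf "found at %d-%d\n", @$_ for match_spans($line, qr/HDWFLSFKD/);
```

For this sample line the loop reports 10-19 and 28-37.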

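You can also use the @- and @+ variables. After a successful match, $-[0] is the offset where the match starts and $+[0] is the offset just past its last character: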

#!/usr/bin/perl

use strict;
use warnings;

my $sequence;
while (my $line = <DATA>) {
    if ($line =~ /HDWFLSFKD/) {
        print "its found index location: $-[0]-$+[0]\n";
    } else {
        $sequence .= $line;
        print "came in else\n";
    }
}

__DATA__
MLTSHQKKF*HDWFLSFKD*SNNYNSKQNHSIKDIFNRFNHYIYNDLGIRTIA
MLTSHQKKFSNNYNSKQNHSIKDIFNRFNHYIYNDLGIRTIA
MLTSHQKKFSNNYNSK*HDWFLSFKD*QNHSIKDIFNRFNHYIYNDL
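
The question asks for the index between the two Ds rather than for the whole motif. One way to get that, sketched here under the assumption that you can put a capture group around the part you care about, is to read $-[1] and $+[1], which hold the offsets of group 1. The name group_span is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# group_span is a hypothetical helper (not from the answer above).
# It returns (start, end) for capture group $n of the first match of
# $re in $line, or an empty list when there is no match.
# Group 0 is the whole match; "end" is one past the last character.
sub group_span {
    my ($line, $re, $n) = @_;
    return unless $line =~ $re;
    return ($-[$n], $+[$n]);
}

my $line = 'MLTSHQKKF*HDWFLSFKD*SNNYNSKQNHSIKDIFNRFNHYIYNDLGIRTIA';

# Group 1 runs from the first D to the last D of the motif.
my ($start, $end) = group_span($line, qr/H(DWFLSFKD)/, 1);
printf "first D at %d, last D at %d\n", $start, $end - 1;
```

With the asterisks counted as characters, this reports the first D at 11 and the last D at 18.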

Upvotes: 13

weismat

Reputation: 7411

You could split your string with the regex and, if the resulting array has more than one element, output the length of the first element. A simple sample:

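my $test = "123;456";
my @help = split(';', $test);
# more than one element means the separator was found
if ($#help > 0) {
    # the length of the text before the separator is its index
    print "Index is: " . length($help[0]) . "\n";
}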

Edit: This fits your simple example, but not your full requirement. With a more complex regex the length of the matched separator is no longer fixed, so you would also need to find where the second array element starts in order to work out how long the match was.
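
A variable-length match can still be handled with split if the pattern is wrapped in a capture group, because split then returns the matched separator as an extra list element. The following is a minimal sketch of that idea, not the code from this answer. The name split_span is hypothetical, and it assumes the pattern never matches an empty string.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# split_span is a hypothetical helper illustrating the split approach.
# It returns (start, end) of the first match of $re in $str, or an
# empty list when $re does not match. "end" is one past the match.
sub split_span {
    my ($str, $re) = @_;
    # The outer capture group makes split return the separator itself;
    # the limit of 2 stops after the first separator.
    my @parts = split /($re)/, $str, 2;
    return unless @parts > 1;
    my $start = length $parts[0];
    return ($start, $start + length $parts[1]);
}

my ($start, $end) = split_span("123;;456", qr/;+/);
print "Index is: $start-$end\n";
```

Here the separator ";;" starts at offset 3 and ends just before offset 5.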

Upvotes: 0
