Michael Starr

Reputation: 7

Printing first instance of match in each line of file (Perl)

I have the following in an executable .pl file:

#!/usr/bin/env perl
$file = 'TfbG_peaks.txt';
open(INFO, $file) or die("Could not open file.");

foreach $line (<INFO>) {
        if ($line =~ m/[^_]*(?=_)/){
                #print $line; #this prints lines, which means there are matches
                print $1; #but this prints nothing
        }
}

Based on my reading at http://goo.gl/YlEN7 and http://goo.gl/VlwKe, print $1; should print the first match in each line, but it doesn't. Help!

Upvotes: 0

Views: 2466

Answers (2)

ikegami

Reputation: 385897

$1 holds whatever the first capture group ((...)) in the pattern captured. Your pattern has no capture group, so $1 is never set.

Maybe you were thinking of

print $& if $line =~ /[^_]*(?=_)/;    # BAD: before Perl 5.20, using $& anywhere slows down every regex match in the program

or

print ${^MATCH} if $line =~ /[^_]*(?=_)/p;   # 5.10+

But the following would be simpler (and work before 5.10):

print $1 if $line =~ /([^_]*)_/;

Note: If you add a leading ^ or (?:^|_) (whichever is appropriate), a failed match will be faster, since the regex engine has fewer starting positions to try.

print $1 if $line =~ /^([^_]*)_/;
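To see the anchored capture in action, here is a minimal sketch. The helper name first_field and the sample lines are made up for illustration; they are not from the question's file.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: returns the text before the first underscore,
# or undef when the line contains no underscore.
sub first_field {
    my ($line) = @_;
    # In list context a successful match returns its captures,
    # so $field receives what $1 would hold; a failed match leaves it undef.
    my ($field) = $line =~ /^([^_]*)_/;
    return $field;
}

# Made-up sample lines, not real data from TfbG_peaks.txt.
for my $line ("chr1_100_200\n", "no underscore here\n", "_leading\n") {
    my $field = first_field($line);
    print defined $field ? "[$field]\n" : "(no match)\n";
}
```

Taking the capture in list context avoids reading a stale $1 left over from an earlier successful match. Note that a line starting with an underscore still matches and yields an empty string.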

Upvotes: 0

raina77ow

Reputation: 106375

No, $1 holds the string saved by the first so-called capture group (created by the bracketing construct ( ... )). For example:

if ($line =~ m/([^_]*)(?=_)/){
   print $1;
   # now this will print something,
   # unless the string begins with an underscore
   # (which still matches the pattern, as * means 'zero or more instances');
   # are you sure you don't need `+` here?
}

The pattern in your original code didn't have any capture groups; that's why $1 was empty (undef, to be precise) there. And (?=...) doesn't count: it is a look-ahead assertion, which checks what follows without capturing or consuming it.
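The difference between the patterns can be checked with a small sketch like the one below. The helper capture_of and the test strings are hypothetical, chosen only to illustrate the cases discussed above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: apply a pattern to some text and describe $1 afterwards.
sub capture_of {
    my ($pattern, $text) = @_;
    return "no match" unless $text =~ $pattern;
    return defined $1 ? "\$1 = '$1'" : "\$1 is undef";
}

# No capture group: the match succeeds, but $1 stays undef.
print capture_of(qr/[^_]*(?=_)/,   "abc_def"), "\n";
# Capture group added: $1 holds the text before the underscore.
print capture_of(qr/([^_]*)(?=_)/, "abc_def"), "\n";
# With *, a leading underscore gives a successful but empty capture.
print capture_of(qr/([^_]*)(?=_)/, "_def"),    "\n";
# With +, at least one character is required, so this string does not match.
print capture_of(qr/([^_]+)(?=_)/, "_def"),    "\n";
```

Whether * or + is the right quantifier depends on whether an empty first field should count as a match for your data.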

Upvotes: 2
