karate_kid

Reputation: 155

Perl Wildcards-Regex

I am trying to read from a file. Here is what my file looks like:

  abc123
      abdef012
    fedabc_23
        xyz12
  12345

What I am trying to do is take an option from the command line and, depending on the wildcard entered (such as *, ?, or +), print the matching lines from the file above. But I am stuck here. I know how * works, but I am not sure about the other wildcards. Please help me.

 #!/perl/bin/perl
 use Getopt::Long;
 open (DATA, "filname.txt") || die "Can't open the file:$!";

 my $fil='';

 my $res= GetOptions (
     "f=s" =>\$fil
 );
 $fil=~ s/[\*]//g;  #Works only if '*' is at the end

 /(\w*$fil\w*)/ && !$seen{$1}++ && push @arr, $1 while <DATA>;

How can I support the other wildcards as well? How can I generalize this?

Upvotes: 1

Views: 5762

Answers (2)

David W.

Reputation: 107060

Let me get this straight:

You have a file, and you want to input a regular expression, and print out all lines that match that expression? Something like grep?

use strict;
use warnings;
use autodie;

my $regex = shift;
my $file  = shift;

open my $fh, "<", $file;  #Autodie will handle not being able to open files...
while ( my $line = <$fh> ) {
    print $line if $line =~ /$regex/;
}
close $fh;

Or, are you trying to use globbing and not regular expressions?

There's a Perl module called Text::Glob that will match on globs or convert a glob into a regular expression.

I have never used it, but it appears pretty simple:

use strict;
use warnings;
use autodie;
use Text::Glob qw(match_glob);

my $glob = shift;
my $file  = shift;

open my $fh, "<", $file;  #Autodie will handle not being able to open files...
while ( my $line = <$fh> ) {
    print $line if match_glob( $glob, $line );
}
close $fh;
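
If you would rather not pull in a module, the same idea can be sketched by hand. This is only an illustration, not how Text::Glob does it internally: wildcard_to_regex is a made-up helper name, it assumes * means "any run of characters" and ? means "exactly one character", and it treats everything else (including +, which is not a glob wildcard) literally. The sample lines are the ones from the question with the surrounding whitespace stripped.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Text::Glob): translate a shell-style
# wildcard pattern into an anchored Perl regex.
sub wildcard_to_regex {
    my ($pattern) = @_;
    my $re = '';
    for my $char (split //, $pattern) {
        if    ($char eq '*') { $re .= '.*' }             # any run of characters, possibly empty
        elsif ($char eq '?') { $re .= '.' }              # exactly one character
        else                 { $re .= quotemeta $char }  # everything else is literal
    }
    return qr/\A$re\z/;    # \A and \z force the whole string to match
}

# Sample lines from the question, with surrounding whitespace removed.
my @lines = qw(abc123 abdef012 fedabc_23 xyz12 12345);

for my $pattern ('abc*', '*abc*', '?????', '*12') {
    my $re      = wildcard_to_regex($pattern);
    my @matches = grep { $_ =~ $re } @lines;
    print "$pattern => @matches\n";
}
```

With that sample data, abc* should select only abc123, while *abc* should also pick up fedabc_23. Because of the quotemeta call, a pattern such as a.c matches the literal string a.c and not abc.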

Upvotes: 3

Ro Yo Mi

Reputation: 15010

The symbol * means 0 or more of the preceding character, so d*x would match ddddddddx, dx, ddx, or just x.

The symbol + means 1 or more of the preceding character, so d+x will also match ddddddx, dx, or ddx, but not a bare x.

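Square brackets define a character class. In Perl a backslash still escapes the next character inside a class, so [\*] matches only a literal * symbol; to allow a backslash as well you would write [\\*]. Many of the special characters in regex lose their meaning inside a square bracket character class, so [*] means the same thing. So [\*]x would match *x but not \x.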

The ? means to match the preceding character 0 or 1 time. So d?x would match dx or x.

The . matches any character except a newline (unless the /s modifier is used).
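
A quick way to check the rules above is to run each pattern against a handful of strings. This is just a sketch: the test strings are made up, and every pattern is anchored with ^ and $ so that the whole string has to match rather than just a piece of it.

```perl
use strict;
use warnings;

# Candidate strings; '\\x' in single quotes is a backslash followed by x.
my @subjects = ('x', 'dx', 'ddx', '*x', '\\x', 'ax');

# Each pattern is anchored with ^ and $ so the whole string must match.
my @patterns = (
    [ 'd*x'   => qr/^d*x$/   ],    # zero or more d, then x
    [ 'd+x'   => qr/^d+x$/   ],    # one or more d, then x
    [ 'd?x'   => qr/^d?x$/   ],    # zero or one d, then x
    [ '[\*]x' => qr/^[\*]x$/ ],    # a literal *, then x
    [ '.x'    => qr/^.x$/    ],    # any single character, then x
);

for my $pair (@patterns) {
    my ($label, $re) = @$pair;
    my @hits = grep { $_ =~ $re } @subjects;
    print "$label matches: @hits\n";
}
```

Here d*x picks up x, dx and ddx, d+x drops the bare x, d?x drops ddx, and [\*]x matches only *x.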

These ideas can be combined so to match any character between quotes you could use '.*' which would find all the characters between the first quote in the string and the last quote in the string (including any quotes in between). Or to match just the text between two quotes you could make the * non-greedy by including a ? as in '.*?'.
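
To see the difference between the greedy and non-greedy forms, here is a small sketch. The sample sentence is made up for illustration; the parentheses capture whatever the pattern matched so it can be printed.

```perl
use strict;
use warnings;

my $text = "say 'one' and 'two' now";

# Greedy: .* grabs as much as it can, so the match runs to the last quote.
my ($greedy) = $text =~ /('.*')/;

# Non-greedy: .*? grabs as little as it can, so the match stops at the next quote.
my ($lazy) = $text =~ /('.*?')/;

print "greedy: $greedy\n";    # greedy: 'one' and 'two'
print "lazy:   $lazy\n";      # lazy:   'one'
```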

You can read more about how quantifiers work, including the possessive variants, over at http://www.regular-expressions.info/possessive.html.

Upvotes: 1
