Mahmoud Emam

Reputation: 1527

Matching Regex in Perl

My program reads ascii.txt and matches patterns against it. It is meant to implement the sed command; I am writing it as an exercise because I am studying Perl.

#!/usr/bin/perl
# sed command implementation
use strict;
use warnings;
use subs qw(read_STDIN read_FILE usage);
use IO::File;
use constant {
    SEARCH_PRINT => 0,
};

our $proj_name = $0;

main(@ARGV);

sub main
{
    if(scalar @_ == 2) {
        read_FILE @_;

    }
    else {
        usage 
    }
}

sub read_FILE {
    my ($sed_script, $file_name) = @_;
    my $parsed_val =  parse_sed_script($sed_script);
    if( $parsed_val == SEARCH_PRINT ) {
        search_print_lines($sed_script, $file_name);
    }
}

sub parse_sed_script {
    my $command = shift or return;
    if($command =~ /^\/([^\/].)*\/$/) {
        return SEARCH_PRINT;
    }
}

sub search_print_lines {
    my ($script, $file) = @_;
    my $fh = IO::File->new($file, "r") or error("no file found $file");
    while( $_ = $fh->getline ) {
        print if $_ =~ $script
    }
}

sub usage {
    message("Usage: $proj_name sed-script [file]")
}

sub error
{
    my $e = shift || 'unknown error';
    print("$0: $e\n");
    exit 0;
}

When I execute from the shell: sed.pl /Test/ ascii.txt

I found that print if $_ =~ $script never prints anything, apparently because the regex is stored in a scalar variable.

ascii.txt contains:

Test 1
REGEX TEST

When I use print $script in the search_print_lines subroutine, it prints the regex passed in by the user.

Upvotes: 1

Views: 189

Answers (2)

Jeremy

Reputation: 380

When you pass something in on the command line and use it in your script, the entire literal text is used. So if you pass in /Test/, the slashes are treated as literal characters, and the "real" regular expression being matched is something like \/Test\/ (with the slashes escaped, because now it's looking for them). Try passing in the regex without the surrounding //.

If your goal is to allow the // to show that it's a regular expression, I would remove them when the program starts.
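To make the difference concrete, here is a minimal sketch that matches the two sample lines from the question twice: once with the argument used as-is, and once after stripping the slashes. The variables $script, @lines, @raw and @stripped are illustrative stand-ins, not names from the original program.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the command-line argument $ARGV[0].
my $script = '/Test/';
my @lines  = ("Test 1\n", "REGEX TEST\n");

# Used as-is, the slashes are part of the pattern, so nothing matches.
my @raw = grep { $_ =~ $script } @lines;

# Strip one leading and one trailing slash, then match again.
(my $pattern = $script) =~ s{^/(.*)/\z}{$1};
my @stripped = grep { $_ =~ $pattern } @lines;

print "raw matches: ", scalar(@raw), "\n";
print "stripped matches: ", scalar(@stripped), "\n";
print @stripped;
```

With the slashes left in, neither line matches because neither contains a literal /. Once they are removed, the first sample line matches; the second does not, since the match is case-sensitive.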

One more edit: If you want to be able to pass in flags, you'd need to eval the input somehow.

my $script = '/Test/i';
my $regex = eval "qr$script";    # string eval: compiles qr/Test/i
die "invalid regex: $@" unless defined $regex;

and then

"REGEX TEST" =~ $regex

should return true. Doing an eval like this is highly insecure, though, because whatever the user passes in is executed as Perl code.

edit: a string eval compiles and runs its argument as Perl code. So the eval above dynamically creates a regular expression and stores it in a variable. That allows you to use regular expression flags like i without having to do any special parsing of the command-line input. When the eval is executed, it will be as if you had typed $regex = qr/Test/i. Then you can match your text against $regex and it will work. I thought of this because the REGEX TEST line in your example would not match unless the i flag made the comparison case-insensitive.
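If the insecurity of a string eval is a concern, one possible alternative is to parse the flags by hand and apply them with an inline modifier group such as (?i). The sketch below is only an illustration under that assumption: compile_sed_pattern is a hypothetical helper, not part of the original program, and it accepts only the flags i, m, s and x.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original program): turn a
# string like '/Test/i' into a compiled regex without a string eval.
sub compile_sed_pattern {
    my ($script) = @_;

    # Capture the pattern between the slashes and any trailing flags.
    # Only the flags i, m, s and x are accepted here.
    my ($pattern, $flags) = $script =~ m{\A/(.*)/([imsx]*)\z}s
        or return;

    # An inline modifier group such as (?i) applies the flags.
    $pattern = "(?$flags)$pattern" if length $flags;

    # qr// dies on an invalid pattern; the block eval only traps that
    # error, it does not execute the input as Perl code.
    my $regex = eval { qr/$pattern/ };
    return $regex;
}

my $regex = compile_sed_pattern('/Test/i');
print "matched\n" if "REGEX TEST" =~ $regex;
```

The helper returns nothing when the argument is not wrapped in slashes or the pattern does not compile, so the caller can fall back to the usage message in either case.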

Upvotes: 3

user4035

Reputation: 23749

You didn't remove the slashes from the $sed_script variable. After I modified your read_FILE function as follows, it started to work:

sub read_FILE {
    my ($sed_script, $file_name) = @_;
    my $parsed_val =  parse_sed_script($sed_script);

    if( $parsed_val == SEARCH_PRINT ) {
        $sed_script =~ s/^\/(.*)\/$/$1/;

        # optionally, precompile the pattern into a regex object
        #$sed_script = qr/$sed_script/;
        search_print_lines($sed_script, $file_name);
    }
}
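As a rough illustration of the commented-out qr// line, the following sketch compiles the stripped pattern once and then applies it to each line. It reads the sample data through an in-memory filehandle so that it runs without ascii.txt on disk; that choice, and the names $data, $regex and @matched, are assumptions made for the example.

```perl
use strict;
use warnings;

# Sample data from the question, read through an in-memory filehandle
# (an assumption, so the example needs no file on disk).
my $data = "Test 1\nREGEX TEST\n";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";

my $sed_script = '/Test/';
$sed_script =~ s/^\/(.*)\/$/$1/;    # same substitution as in read_FILE
my $regex = qr/$sed_script/;        # compile once, before the loop

my @matched;
while (my $line = <$fh>) {
    push @matched, $line if $line =~ $regex;
}
close $fh;

print @matched;    # prints "Test 1"
```

Compiling with qr// also surfaces an invalid pattern before any input is read, rather than at the first match inside the loop.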

Upvotes: 1
