sji

Reputation: 1907

How to pass a replacing regex as a command line argument to a perl script

I am trying to write a simple perl script to apply a given regex to a filename among other things, and I am having trouble passing a regex into the script as an argument.

What I would like to be able to do is something like this:

> myscript 's/hi/bye/i' hi.h
bye.h
>

I have produced this code

#!/utils/bin/perl -w
use strict;
use warnings;

my $n_args = $#ARGV + 1;
my $regex =  $ARGV[0];
for(my $i=1; $i<$n_args; $i++) {
  my $file = $ARGV[$i];

  $file =~ $regex;
  print "OUTPUT: $file\n";
}

I cannot use qr because apparently it cannot be used on replacing regexes (although my source for this is a forum post so I'm happy to be proved wrong).

I would rather avoid passing the two parts in as separate strings and manually building the substitution in the perl script.

Is it possible to pass the regex as an argument like this, and if so what is the best way to do it?

Upvotes: 6

Views: 4878

Answers (4)

raina77ow

Reputation: 106385

There's more than one way to do it, I think.

The Evial Way:

As you are basically passing in a substitution expression, it can be evaluated to get the result. Like this:

my @args = ('s/hi/bye/', 'hi.h');
my ($regex, @filenames) = @args;
for my $file (@filenames) {
  eval("\$file =~ $regex");
  print "OUTPUT: $file\n";
}

Of course, going this way opens you up to some very nasty surprises. For example, consider passing this set of arguments:

...
my @args = ('s/hi/bye/; print qq{MINE IS AN EVIL LAUGH!\n}', 'hi.h');
...

Yes, it will laugh at you most evailly.

The Safe Way:

my ($regex_expr, @filenames) = @args;
my ($substr, $replace) = $regex_expr =~ m#^s/((?:[^/]|\\/)+)/((?:[^/]|\\/)+)/#;
for my $file (@filenames) {
  $file =~ s/$substr/$replace/;
  print "OUTPUT: $file\n";
}

As you can see, we parse the expression given to us into two parts, then use these parts to build a full operator. Obviously, this approach is less flexible, but, of course, it's much safer.
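To make the parsing idea a bit more concrete, here is a minimal sketch that also accepts an empty replacement and the i and g flags. The helper names parse_substitution and apply_substitution are made up for this illustration, and it assumes / is the only delimiter you need to support; anything that does not fit the expected shape is rejected rather than executed.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the answer above): split a string such as
# 's/hi/bye/i' into pattern, replacement and flags. Assumes '/' is the only
# delimiter and that only the 'i' and 'g' flags are allowed.
sub parse_substitution {
    my ($expr) = @_;
    my ($pattern, $replacement, $flags) =
        $expr =~ m{\As/((?:[^/\\]|\\.)+)/((?:[^/\\]|\\.)*)/([ig]*)\z}
        or return;    # empty list when the string does not fit
    return ($pattern, $replacement, $flags);
}

# Hypothetical helper: apply the parsed substitution to a copy of $text.
sub apply_substitution {
    my ($expr, $text) = @_;
    my ($pattern, $replacement, $flags) = parse_substitution($expr)
        or die "Cannot parse '$expr' as s/PATTERN/REPLACEMENT/FLAGS\n";

    # The replacement is used as a plain string, so \/ is unescaped by hand
    # and things like $1 are NOT expanded.
    $replacement =~ s{\\/}{/}g;

    my $re = $flags =~ /i/ ? qr/$pattern/i : qr/$pattern/;
    if ($flags =~ /g/) {
        $text =~ s/$re/$replacement/g;
    }
    else {
        $text =~ s/$re/$replacement/;
    }
    return $text;
}

my ($expr, @files) = @ARGV;
print "OUTPUT: ", apply_substitution($expr, $_), "\n" for @files;
```

With this sketch, the evil argument from the first example is simply refused, because the trailing "; print ..." is not a valid flag list. The price is that capture references in the replacement and other delimiters do not work.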

The Easiest Way:

my ($search, $replace, @filenames) = @args;
for my $file (@filenames) {
  $file =~ s/$search/$replace/;
  print "OUTPUT: $file\n";
}

Yes, that's right - no regex parsing at all! Here we simply take two arguments, a search pattern and a replacement string, instead of a single one. Does that make the script less flexible than the previous one? No, as the previous version could only handle expressions in one fixed form anyway. But now the user clearly understands all the data that is given to the command, which is usually quite an improvement.

@args in the last two examples corresponds to the @ARGV array.

Upvotes: 9

TLP

Reputation: 67900

The trouble is that you are trying to pass a perl operator when all you really need to pass is the arguments:

myscript hi bye hi.h

In the script:

my ($find, $replace, @files) = @ARGV;
...
$file =~ s/$find/$replace/i;

Your code is a bit clunky. This is all you need:

use strict;
use warnings;

my ($find, $replace, @files) = @ARGV;
for my $file (@files) {
    $file =~ s/$find/$replace/i;
    print "$file\n";
}

Note that this way allows you to use metacharacters in the regex, such as \w{2}foo?. This can be both a good thing and a bad thing. To make all characters interpreted literally (disable metacharacters), you can use \Q ... \E like so:

... s/\Q$find\E/$replace/i;
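To see the difference this makes, here is a small sketch. The helper replace_in and its $literal switch are made up for the illustration; the substitutions themselves are the two forms shown above.

```perl
use strict;
use warnings;

# Hypothetical helper: replace the first match of $find in $text, treating
# $find either as a regex or, when $literal is true, as a literal string.
sub replace_in {
    my ($text, $find, $replace, $literal) = @_;
    if ($literal) {
        $text =~ s/\Q$find\E/$replace/i;    # metacharacters are escaped
    }
    else {
        $text =~ s/$find/$replace/i;        # metacharacters are active
    }
    return $text;
}

# As a regex, '.h' means "any character followed by h", so it matches "oh".
print replace_in('oh.h', '.h', '.c', 0), "\n";    # .c.h
# With \Q...\E the dot is literal, so only ".h" matches.
print replace_in('oh.h', '.h', '.c', 1), "\n";    # oh.c
```

Which behaviour you want depends on who is calling the script; for plain file extension renames the literal form is probably the less surprising one.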

Upvotes: 2

Ian Roberts

Reputation: 122364

The s/a/b/i is an operator, not simply a regular expression, so you need to use eval if you want it to be interpreted properly.

#!/usr/bin/env perl

use warnings;
use strict;

my $regex = shift;
my $sub = eval "sub { \$_[0] =~ $regex; }";

foreach my $file (@ARGV) {
    $sub->($file);
    print "OUTPUT: $file\n";
}

The trick here is that I'm substituting the "bit of code" from the command line into a string to produce Perl source for an anonymous subroutine, sub { $_[0] =~ s/a/b/i; } (or whatever code you pass it), then using eval to compile that source and give me a code reference I can call from within the loop.

$ test.pl 's/foo/bar/' foo nicefood
OUTPUT: bar
OUTPUT: nicebard

$ test.pl 'tr/o/e/' foo nicefood
OUTPUT: fee
OUTPUT: nicefeed

This is more efficient than putting an eval "\$file =~ $regex;" inside the loop, since that would be compiled and evaluated on every iteration rather than just once up front.

A word of warning about eval - as raina77ow's answer explains, you should avoid eval unless you're 100% sure you are always getting your input from a trusted source...
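One more thing worth handling: if the argument is not valid Perl, string eval returns undef and sets $@, and the later call on the undefined reference fails with a much less helpful message. A minimal sketch of checking for that up front (compile_operator is a name made up for this illustration; it does nothing to make untrusted input safe):

```perl
use strict;
use warnings;

# Hypothetical helper: compile an operator string such as 's/foo/bar/' into
# a code reference that modifies its first argument in place, or die with
# Perl's own compile error.
sub compile_operator {
    my ($op) = @_;
    my $code = eval "sub { \$_[0] =~ $op; }";
    die "Could not compile '$op': $@" unless $code;
    return $code;
}

my ($op, @files) = @ARGV;
if (defined $op) {
    my $sub = compile_operator($op);
    for my $file (@files) {
        $sub->($file);    # $_[0] is an alias, so $file itself is changed
        print "OUTPUT: $file\n";
    }
}
```

Called with something like 's/foo' (missing the closing delimiters), this stops before the loop and reports the compile error instead of processing any files.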

Upvotes: 4

choroba

Reputation: 241808

s/a/b/i is not a regex. It is a regex plus a substitution. Unless you use string eval, making this work might be pretty tough (consider s{a}<b>e and so on).

Upvotes: 2
