Reputation: 5831
I want to be able to take an argument from the command line and use it as a regular expression within my script to filter lines from my file. A simple example:
$ perl script.pl id_4
In script.pl:
...
my $exp = shift;
while (my $line = <$fh>) {
    if ($line =~ /$exp/) {
        print $line, "\n";
    }
}
...
My actual script is a bit more complicated and does other manipulations to the line to extract information and produce a different output. My problem is that I have situations where I want to filter out every line that contains "id_4" instead of selecting only the lines that contain "id_4". Normally this could be achieved by:
if($line !~ /$exp/)
but, if possible, I don't want to alter my script to accept a more complex set of arguments (e.g. use !~ if the second parameter is "ne", and =~ if not).
Can anyone think of a regex that I can use (besides a long "id_1|id_2|id_3|id_5...") to filter out lines containing one particular value out of many possibilities? I fear I'm asking for the daft here, and should probably just stick to the sensible and accept a further argument :/.
Upvotes: 0
Views: 661
Reputation: 67918
Why choose? Have both.
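my $exp = join "|", grep !/^!/, @ARGV;
my @not = grep /^!/, @ARGV;
s/^!// for @not;
my $exp_not = join "|", @not;
...
# Skip a test whose pattern is empty: an empty pattern does not mean "no filter".
if (( $exp eq "" || $line =~ /$exp/ ) && ( $exp_not eq "" || $line !~ /$exp_not/ )) {
    # do stuff
}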
Usage:
perl script.pl orange soda '!light' '!diet'
(The quotes keep an interactive shell such as bash from treating ! as history expansion.)
Upvotes: 1
Reputation: 58598
There is a way to invert regular expressions, so you can do matches like "all strings which do not contain a match for subexpr". Without the operators which express this directly (i.e. using only the basic positive-matching regex operators), it is still possible but leads to large and unwieldy regular expressions (possibly with a combinatorial explosion in the regex size).
For a simple example, look at my answer to the question "Match all letter/number combos but specific word?", which asks how to write a regex that matches everything but the string "help". (It's quite a simplification that the match there is anchored to start and end.)
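Perl's regex engine happens to have an operator that expresses this kind of negation directly: the negative lookahead, (?!...). The following is a minimal sketch rather than a drop-in solution; the sample lines are made up for illustration, and the pattern is hard-coded where the question's script would take it from the command line.

```perl
use strict;
use warnings;

# Negative lookahead anchored at the start of the line: the match succeeds
# only if "id_4" cannot be found anywhere later in the line.
my $exp = '^(?!.*id_4)';

# Hypothetical sample data standing in for the lines read from the file.
my @lines = ("id_1 foo\n", "id_4 bar\n", "baz id_4\n", "id_5 qux\n");

# Same positive test as in the question ($line =~ /$exp/), unchanged.
my @kept = grep { /$exp/ } @lines;
print @kept;
```

With the question's script left as it is, the same pattern could be passed as the argument, quoted so the shell leaves it alone: perl script.pl '^(?!.*id_4)'. Note that this also rejects lines containing longer values such as id_40, just as the plain pattern id_4 would select them.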
Traditional Unix tools have hacks for situations when you want to just invert the match of the expression as a whole: grep versus grep -v, or, in vi, :g/pat/ versus :v/pat/, etc. In this way, the implementors ducked out of implementing the difficult regex operators that don't fit into the simple NFA construction approach.
The easiest thing is to do the same thing and have a convention for coarse-grained negation: an include pattern and an exclude pattern.
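One way such a convention might look in Perl is sketched below. The helper name keep_line, the sample lines, and both patterns are assumptions made for illustration; how the two patterns reach the script (a second argument, an option, a prefix character) is left open.

```perl
use strict;
use warnings;

# Hypothetical helper: a line is kept if it matches the include pattern
# (when one is given) and does not match the exclude pattern (when one is
# given). undef means "no filter", which avoids matching against an empty
# pattern.
sub keep_line {
    my ($line, $include, $exclude) = @_;
    return 0 if defined $include && $line !~ $include;
    return 0 if defined $exclude && $line =~ $exclude;
    return 1;
}

# Made-up sample data and patterns for illustration.
my @lines   = ("id_1 orange\n", "id_4 orange\n", "id_5 apple\n");
my $include = qr/orange/;
my $exclude = qr/id_4/;

my @kept = grep { keep_line($_, $include, $exclude) } @lines;
print @kept;
```

Keeping the two patterns separate means each one stays an ordinary positive regex, and the negation lives in a single !~ test instead of inside the pattern.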
Upvotes: 0