Reputation: 995
I am splitting a text file into blocks in order to extract those blocks which do not contain a certain line by using a regular expression. The text file looks like this:
[Term]
id: id1
name: name1
xref: type1:aab
xref: type2:cdc
[Term]
id: id2
name: name2
xref: type1:aba
xref: type3:fee
Someone helped me a few days ago by showing me how to extract those blocks which do contain a certain regular expression (for example "xref: type3"):
while (<MYFILE>) {
    BEGIN { $/ = q|| }
    my @lines = split /\n/;
    for my $line ( @lines ) {
        if ( $line =~ m/xref:\s*type3/ ) {
            printf NEWFILE qq|%s|, $_;
            last;
        }
    }
}
Now I want to write all blocks which do not contain "xref: type3" to a new file. I tried to do this by simply negating the regex
if ( $line !~ m/xref:\s*type3/ )
or alternatively by negating the if statement by using
unless ( $line =~ m/xref:\s*type3/ )
Unfortunately it doesn't work: the output file is the same as the original one. Any ideas what I'm doing wrong?
Upvotes: 0
Views: 947
Reputation: 385857
You have:
For every line, print this block if this line doesn't match the pattern.
But you want:
Print the block only if none of the lines in the block match the pattern.
As such, you can't start printing the block before you have examined every line in the block (or at least all lines until you find a matching line).
local $/ = q||;
while (<MYFILE>) {
    my @lines = split /\n/;
    my $skip = 0;
    for my $line ( @lines ) {
        if ( $line =~ m/^xref:\s*type3/ ) {
            $skip = 1;
            last;
        }
    }
    if (!$skip) {
        for my $line ( @lines ) {
            # split removed the newlines, so add them back
            print NEWFILE "$line\n";
        }
        print NEWFILE "\n";    # blank line between blocks
    }
}
But there's no need to split into lines. We can check and print the whole block at once.
local $/ = q||;
while (<MYFILE>) {
    print NEWFILE $_ if !/^xref:\s*type3/m;
}
(Note the /m to make ^ match the start of any line.)
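To see what /m changes, here is a minimal sketch that applies the same pattern to a single block held in a string. The variable names and the sample string are illustrative assumptions, not part of the original script. Without /m, ^ anchors only at the very start of the string, so an xref line in the middle of the block is never found.

```perl
use strict;
use warnings;

# Hypothetical sample: one record as it would arrive in paragraph mode.
my $block = "[Term]\nid: id2\nname: name2\nxref: type1:aba\nxref: type3:fee\n\n";

# Without /m, ^ anchors only at the start of the whole string.
my $without_m = $block =~ /^xref:\s*type3/  ? 1 : 0;

# With /m, ^ also anchors right after every embedded newline.
my $with_m    = $block =~ /^xref:\s*type3/m ? 1 : 0;

print "without /m: $without_m\n";
print "with /m:    $with_m\n";
```

If the anchor is not needed, dropping ^ altogether (as in the question's pattern) also works, at the cost of matching "xref: type3" anywhere in a line.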
Upvotes: 3
Reputation: 241908
Do not process the records line by line. Use paragraph mode:
{
    local $/ = q();
    while (<MYFILE>) {
        # No "last" here: keep reading so every remaining block is checked.
        if ( !/xref:\s*type3/ ) {
            printf NEWFILE qq|%s|, $_;
        }
    }
}
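For experimenting without real files, the paragraph-mode approach can be sketched as a self-contained script. It assumes the blocks in the input are separated by a blank line (which is what paragraph mode keys on) and uses in-memory filehandles; the names $input, $output, $in and $out are hypothetical stand-ins for MYFILE and NEWFILE.

```perl
use strict;
use warnings;

# Hypothetical input: the two sample records, separated by a blank line.
my $input = <<'END';
[Term]
id: id1
name: name1
xref: type1:aab
xref: type2:cdc

[Term]
id: id2
name: name2
xref: type1:aba
xref: type3:fee
END

my $output = '';
open my $in,  '<', \$input  or die "Cannot open input string: $!";
open my $out, '>', \$output or die "Cannot open output string: $!";

{
    local $/ = '';    # paragraph mode: each read returns one block
    while ( my $block = <$in> ) {
        # Keep the block only if no line in it starts with "xref: type3".
        print {$out} $block unless $block =~ /^xref:\s*type3/m;
    }
}

close $in;
close $out;
print $output;
```

With this input, only the id1 block should reach $output, since the id2 block contains an "xref: type3" line.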
Upvotes: 1
Reputation: 61510
The problem is that you are using unless with !~, which is interpreted as "if $line does not NOT match, do this" (a double negative).
When using the unless block with the normal pattern-matching operator =~, your code worked perfectly; that is, I see the first block as output because it does not contain type3.
LOOP:
while (<$MYFILE>) {
    BEGIN { $/ = q|| }
    my @lines = split /\n/;
    for my $line ( @lines ) {
        unless ( $line =~ m/xref:\s*type3/ ) {
            printf qq|%s|, $_;
            last LOOP;
        }
    }
}
# prints
# [Term]
# id: id1
# name: name1
# xref: type1:aab
# xref: type2:cdc
Upvotes: 1