Reputation: 5191
I need help with a script that takes two files as input:
File1 [TEXT]: contains paragraphs separated by blank lines
File2 [SEARCH KEYS]: contains paragraphs separated by blank lines
and creates an output file, File3, which contains the text from File1 except those paragraphs that exactly match a paragraph in File2.
That is, the script needs to look up each paragraph of File1 in File2. If a perfect match (with all lines matching) is found, the matching paragraph is dropped from the output File3.
Given: 2 Files
File1:
PARA1_LINE1
PARA1_LINE2
PARA1_LINE3

PARA2_LINE1
PARA2_LINE2

PARA1_LINE1
PARA1_LINE2
PARA1_LINE3
File2:
PARA1_LINE1
PARA1_LINE2
PARA1_LINE3

PARA2_LINE1
Required Output:
File3:
PARA2_LINE1
PARA2_LINE2
Note: The second paragraph [PARA2] is not a complete match, hence it should not be omitted from File3.
Thanks
Upvotes: 1
Views: 428
Reputation: 77135
This awk command should work:
awk -v RS= -v ORS='\n\n' 'NR==FNR{a[$0]++;next}!($0 in a)' file2 file1
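Setting an empty RS variable puts awk in paragraph mode, so each blank-line-separated paragraph is read as one record. The array a holds the paragraphs read from file2, and a paragraph from file1 is printed only if it is not in that array.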
$ cat file1
PARA1_LINE1
PARA1_LINE2
PARA1_LINE3

PARA2_LINE1
PARA2_LINE2

PARA1_LINE1
PARA1_LINE2
PARA1_LINE3
$ cat file2
PARA1_LINE1
PARA1_LINE2
PARA1_LINE3

PARA2_LINE1
$ awk -v RS= -v ORS='\n\n' 'NR==FNR{a[$0]++;next}!($0 in a)' file2 file1
PARA2_LINE1
PARA2_LINE2
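For readers who find the one-liner dense, here is a sketch of the same logic spelled out with comments and run against the sample data from the question. The function name drop_matching_paragraphs, the array name seen, and the temporary file layout are illustrative assumptions, not part of the original answer.

```shell
#!/bin/sh
# Sketch: the one-liner written out, applied to the sample data from the question.
# drop_matching_paragraphs and the temp file names are illustrative assumptions.

drop_matching_paragraphs() {
    # $1 = search key file (file2), $2 = text file (file1)
    awk -v RS= -v ORS='\n\n' '
        # While reading the first file, NR (global count) equals FNR (per-file count).
        NR == FNR { seen[$0]++; next }   # remember each key paragraph verbatim
        # For the second file: print a paragraph only if it is not a stored key.
        !($0 in seen)
    ' "$1" "$2"
}

tmpdir=$(mktemp -d)
printf 'PARA1_LINE1\nPARA1_LINE2\nPARA1_LINE3\n\nPARA2_LINE1\nPARA2_LINE2\n\nPARA1_LINE1\nPARA1_LINE2\nPARA1_LINE3\n' > "$tmpdir/file1"
printf 'PARA1_LINE1\nPARA1_LINE2\nPARA1_LINE3\n\nPARA2_LINE1\n' > "$tmpdir/file2"

drop_matching_paragraphs "$tmpdir/file2" "$tmpdir/file1" > "$tmpdir/file3"
cat "$tmpdir/file3"
```

Two things to keep in mind: the comparison is on the whole paragraph text, so trailing spaces or CRLF line endings in one file will prevent a match, and ORS='\n\n' leaves a blank line after the last paragraph in the output. The NR==FNR test also assumes file2 is not empty.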
Upvotes: 2
Reputation: 35208
This uses the input record separator $/ to read the files one paragraph at a time. Note that I didn't use chomp, since the last record might end with only a single newline; the trailing newlines are stripped with a regex instead.
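use strict;
use warnings;

if (@ARGV != 2) {
    print "Usage: $0 [Text File] [Search Key File]\n";
    exit;
}

# Set the text file aside; <> then reads the search key file left in @ARGV.
my $file1 = shift;

# Read records separated by a blank line instead of line by line.
local $/ = "\n\n";

# Store each search paragraph, minus trailing newlines, as a hash key.
my %para;
while (<>) {
    s/\n+$//;
    $para{$_} = 1;
}

# Now read the text file and print only paragraphs that are not search keys.
local @ARGV = $file1;
while (<>) {
    s/\n+$//;
    print $_, $/ if !$para{$_};
}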
Upvotes: 1