Reputation: 255
I want the match to occur but it's not happening.
I have an issue. I wish to find occurences of lines from one file in another.
Here's one file (@file)
735 1 1
1891 1 0
2021 1 1
1892 2 1
667 1 0
802 2 1
665 1 0
666 1 1
596 1 0
3193 2 1
Here's the one in which I have to find above lines (@file1)
1521 1 0 : 1167 0 0 : 1167 2 0 : 1167 1 0 ;
2605 1 1 ;
2280 0 1 : 2280 2 0 : 1892 0 0 : 2280 1 0 : 2021 0 0 ;
1892 2 1 : 667 0 1 : 667 1 0 ;
1892 1 1 ;
Here's the code I wrote
foreach $leadline (@file1) {
foreach $line (@file) {
$_ = ' ' . $leadline;
$line = ' ' . $line;
if (m/$line/) {
push @final, $_;
}
}
}
But I am unable to detect the lines.
@file1
and @file
variables store the contents of the files.
I either get no lines detected or all lines detected.
The reason I am concatenating a Space before the two lines is , that sometimes 667 1 0 can occur as the very first phrase in a given line.I am not comfortable with Regex to do that in Regex directly.
Note :- If line i and line j in first file occur as a pattern in the same line of the other file then output should be just one of the lines. Also , if a pattern 1667 1 0 is found, it shouldn't be confused with 667 1 0. Hence I added the whitespace.
I was able to achieve this goal in Python but unable to replicate it in Perl . Here's the Python snippet :-
for line1 in file1:
for j in range(0,len(file0)-1):# in file0:
if ' '+lines[j][0:len(file0[j])-1] in ' '+line1:
i = i + 1
print line1[0:len(line1)-1]
break
Expected output is :- 1892 2 1 : 667 0 1 : 667 1 0 ;
Upvotes: 0
Views: 142
Reputation: 126722
I now think this is a solution to a different problem, but here it is anyway!
use warnings;
use strict;
use 5.010;
use Array::Utils 'array_diff';
open my $fh, '<', 'f1.txt' or die $!;
my @f1;
while ( <$fh> ) {
push @f1, [split];
}
my @final;
open $fh, '<', 'f2.txt' or die $!;
while ( <$fh> ) {
my @f2 = map [ /\d+/g ], split /:/;
for my $f1 ( @f1 ) {
my @matches = grep { not array_diff(@$f1, @$_) } @f2;
push @final, map "@$_", @matches;
}
}
say for @final;
output
1892 2 1
667 0 1
667 1 0
Update
Okay here's my second attempt! This is essentially what choroba
wrote but using map
and with the addition of stripping all trailing whitespace on the data from the first file.
use warnings;
use strict;
use 5.014; # For non-destructive substitution
open my $fh, '<', 'f1.txt' or die $!;
my @f1 = map s/\s+\z//r, <$fh>;
my $re = join '|', @f1;
open $fh, '<', 'f2.txt' or die $!;
my @final = grep /\b(?:$re)\b/, <$fh>;
print for @final;
output
1892 2 1 : 667 0 1 : 667 1 0 ;
Upvotes: 1
Reputation: 91415
Here is how I'd do the job:
use Modern::Perl;
use Data::Dumper;$Data::Dumper::Indent = 1;
my @file = (
'735 1 1',
'1891 1 0',
'2021 1 1',
'1892 2 1',
'667 1 0',
'802 2 1',
'665 1 0',
'666 1 1',
'596 1 0',
'3193 2 1',
);
my @final;
while(my $line = <DATA>) {
chomp $line;
if (grep{$line =~ /\b$_\b/} @file) {
push @final, $line;
}
}
say Dumper\@final;
__DATA__
1521 1 0 : 1167 0 0 : 1167 2 0 : 1167 1 0 ;
2605 1 1 ;
2280 0 1 : 2280 2 0 : 1892 0 0 : 2280 1 0 : 2021 0 0 ;
1892 2 1 : 667 0 1 : 667 1 0 ;
1892 1 1 ;
Output:
$VAR1 = [
'1892 2 1 : 667 0 1 : 667 1 0 ; '
];
With your files:
use Modern::Perl;
use Data::Dumper;$Data::Dumper::Indent = 1;
open my $fh, '<', 'file.txt' or die "unable to open 'file.txt': $!";
my @file = <$fh>;
chomp @file;
my @final;
open $fh, '<', 'file1.txt' or die "unable to open 'file1.txt': $!";
while(my $line = <$fh>) {
chomp $line;
if (grep{$line =~ /\b$_\b/} @file) {
push @final, $line;
}
}
say Dumper\@final;
Upvotes: 1
Reputation: 241858
You can create a regex by joining the lines from file1 by |
(and applying quotemeta
on each). \b
should prevent matching 667
in 1667
.
#!/usr/bin/perl
use warnings;
use strict;
my @search;
open my $F1, '<', 'file1' or die $!;
while (<$F1>) {
chomp;
push @search, quotemeta;
}
my $regex = join '|', @search;
$regex = qr/\b(?:$regex)\b/;
open my $F2, '<', 'file2' or die $!;
while (<$F2>) {
print if /$regex/;
}
Upvotes: 1