Reputation: 1108
I have 15 different files, and I want to create a new file that contains only the lines common to all of them. For example:
file1:
id1
id2
id3
file2:
id2
id3
id4
file3:
id10
id2
id3
file4:
id100
id45
id3
id2
I need the output to look like this:
newfile:
id2
id3
I know this command works for each pair of files:
grep -w -f file1 file2 > output
but I need a command that works for more than two files.
Any suggestions, please?
Upvotes: 3
Views: 388
Reputation: 23697
The zet command provides set operations between input files. Use the intersect option to get the lines common to all the input files. The input content doesn't have to be sorted. The output order will be the same as the order of the input lines.
$ zet intersect file1 file2 file3 file4
id2
id3
Upvotes: 0
Reputation: 242373
Perl to the rescue:
perl -lne 'BEGIN { $count = @ARGV }
$h{$_}{$ARGV} = 1;
}{
print $_ for grep $count == keys %{ $h{$_} }, keys %h
' file* > newfile
-n reads the input files line by line.
-l adds a newline to print.
@ARGV contains the input file names; assigning it to $count at the BEGINning just counts them.
$ARGV contains the name of the current input file.
$_ contains the current line read from the file.
%h is a hash with ids as keys; each key holds a hash reference whose keys are the names of the files that contained the id.
}{ is the "Eskimo greeting" operator; it introduces code that runs once the input is exhausted.
Upvotes: 6
Reputation: 113994
The same trick can be used more than once:
$ grep -w -f file1 file2 | grep -w -f file3 | grep -w -f file4
id2
id3
By the way, if you are looking for exact matches rather than regular-expression matches, it is better and faster to use the -F flag:
$ grep -wFf file1 file2 | grep -wFf file3 | grep -wFf file4
id2
id3
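The chain above grows by one grep per file, which gets unwieldy for the 15 files in the question. Here is a minimal sketch of the same idea written as a loop. It assumes whole-line matching is what is wanted (-x instead of the -w used above), and the scratch names common and common.tmp are hypothetical.

```shell
# Self-contained demo: recreate the four sample files in a scratch directory.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' id1 id2 id3 > file1
printf '%s\n' id2 id3 id4 > file2
printf '%s\n' id10 id2 id3 > file3
printf '%s\n' id100 id45 id3 id2 > file4

# Start from the first file and narrow the candidate list once per remaining file.
cp file1 common
for f in file2 file3 file4; do
    # -x: whole-line match, -F: fixed strings, -f: read the patterns from "$f"
    grep -xFf "$f" common > common.tmp
    mv common.tmp common
done

result=$(cat common)
printf '%s\n' "$result"
```

For the real 15 files, the list in the for loop could be replaced by a glob, as long as the scratch files do not match that glob.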
An awk alternative that reads each file only once:
$ awk 'FNR==1{nfiles++; delete fseen} !($0 in fseen){fseen[$0]++; seen[$0]++} END{for (key in seen) if (seen[key]==nfiles) print key}' file1 file2 file3 file4
id3
id2
FNR==1{nfiles++; delete fseen}
Every time that we start reading a new file, we do two things: (1) increment the file counter, nfiles, and (2) delete the array fseen.
!($0 in fseen){fseen[$0]++; seen[$0]++}
If the current line is not a key in fseen, then add it to fseen and increment the count for this line in seen.
END{for (key in seen) if (seen[key]==nfiles) print key}
After we have read the last line of the last file, we look at every key in seen. If the count for that key is equal to the number of files that we have read, nfiles, then we print that key. The order in which awk visits the keys is unspecified, which is why id3 can be printed before id2.
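To see why the per-file array fseen matters, here is a small sketch that runs the same program next to a variant without that guard. The input files a.txt and b.txt are hypothetical, and the output is piped through sort only to make the order predictable.

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1
# Hypothetical inputs: id9 appears twice in a.txt but is missing from b.txt.
printf '%s\n' id9 id9 id2 > a.txt
printf '%s\n' id2 id7 > b.txt

# Same program as above: each line is counted at most once per file.
with_fseen=$(awk 'FNR==1{nfiles++; delete fseen}
    !($0 in fseen){fseen[$0]++; seen[$0]++}
    END{for (key in seen) if (seen[key]==nfiles) print key}' a.txt b.txt | sort)

# Variant without the per-file guard: every occurrence is counted.
without_fseen=$(awk 'FNR==1{nfiles++}
    {seen[$0]++}
    END{for (key in seen) if (seen[key]==nfiles) print key}' a.txt b.txt | sort)

printf 'with fseen:\n%s\n' "$with_fseen"
printf 'without fseen:\n%s\n' "$without_fseen"
```

Without the guard, the duplicated id9 reaches a count of 2 from a single file and is reported as common even though b.txt does not contain it.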
Upvotes: 4
Reputation: 18411
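# Note: this keeps every line of file1 that occurs in at least one of the other files,
# so it equals the true intersection only for inputs like the sample data.
grep -hxf file1 file2 file3 file4 | sort -u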
id2
id3
# To save the result to a file:
grep -hxf file1 file2 file3 file4 | sort -u > output.txt
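To check how this pipeline behaves when a line is missing from only one of the files, here is a small sketch with hypothetical inputs f1, f2 and f3. It also shows a counting variant built from sort and uniq -c, which is a different technique from the one in this answer and assumes the ids contain no whitespace.

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1
# Hypothetical inputs: id1 is in f1 and f2 but not in f3.
printf '%s\n' id1 id2 > f1
printf '%s\n' id1 id2 > f2
printf '%s\n' id2 id3 > f3

# The pipeline from this answer, with -F added for literal matching.
pairwise=$(grep -hxFf f1 f2 f3 | sort -u)

# Counting variant: de-duplicate each file, then keep lines seen in all n files.
# awk prints only the second field, hence the no-whitespace assumption.
n=3
counted=$(for f in f1 f2 f3; do sort -u "$f"; done | sort | uniq -c |
    awk -v n="$n" '$1 == n { print $2 }')

printf 'grep | sort -u:\n%s\n' "$pairwise"
printf 'counting:\n%s\n' "$counted"
```

Here the grep pipeline still prints id1, because id1 matches in f2, while the counting variant prints only id2.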
Upvotes: 1