Reputation: 193
I have two files, I need to compare their first columns and if the match is found, I'd like to output the corresponding values from both files.
Similar to this Q but I'd like to print columns from both files not one: How to compare multiple columns in two files and retrieve the corresponding value from another column if match found
File1.txt
adeqY 33.7
AIsLX 65.6
AmuBv 1589.0
aZMIx 84.4
File2.txt
AmuBv foo
iwwlp bar
adeqY hi
qUbJZ bye
Output
hi 33.7
foo 1589.0
I have the following awk
command but I only managed to print the second column match from File2:
awk 'FNR==NR{a[$1]; next} ($1) in a {print $2 a[$2]}' File1.txt File2.txt
a[$2]
doesn't want to print
Thanks in advance.
Upvotes: 4
Views: 3009
Reputation: 8711
Try Perl variant
$ cat eskp1.txt
adeqY 33.7
AIsLX 65.6
AmuBv 1589.0
aZMIx 84.4
$ cat eskp2.txt
AmuBv foo
iwwlp bar
adeqY hi
qUbJZ bye
$ perl -F"\s+" -lane 'BEGIN { %kv=map{split /\s+/ } qx(cat eskp1.txt) } { print "$F[1] $kv{$F[0]}" if $kv{$F[0]} } ' eskp2.txt
foo 1589.0
hi 33.7
Upvotes: 0
Reputation: 15472
What you are trying to do there is essentially an INNER JOIN on two tables stored in text files and the Linux join command is designed for just that.
Try:
join -t' ' -1 1 -2 1 -o 2.2,1.2 <(sort file1.txt) <(sort file2.txt)
foo 1589.0
hi 33.7
Explanation:
-1 1 -2 1
is says to join on the first field for the left-hand side file and the first field on the right-hand side file.-o 2.2,1.2
defines the fields you want to return. The second field from the second file then the second field from the first file.See also this blog (ref).
Upvotes: 1
Reputation: 16824
What's happening here is that, once you're printing results in the last awk statement, the a[]
array is no longer in scope, hence why that second value is not printing.
Maybe there is another way to do this awk
, but this is the solution I came up with:
for each in $(comm -1 -2 <(awk '{print $1 }' file1.txt | sort ) <(awk '{print $1 }' file2.txt | sort) ); do echo $(grep $each file2.txt | awk '{print $2}') $(grep $each file1.txt | awk '{print $2}') ; done;
This outputs:
foo 1589.0
hi 33.7
Explanation:
comm
command on the two files.comm
are actually process substituted, so that they are sorted first (comm
expects sorted input) and only the first column is shown.-1 -2
parameters to comm
tells it to only print the common elements from the files (it can show elements unique to first file, unique to second file, or common to both)for each
of those common elements, go ahead and grep each of the files for the line where it appears, and show only the second value, using awk.So, in the end, we needed a bash for
loop, sort
, comm
and awk
multiple times. Not the most elegant solution probably, but it gets the job done.
Upvotes: 0
Reputation: 133428
Could you please try following.
awk 'FNR==NR{a[$1]=$2;next} ($1 in a){print $2,a[$1]}' Input_file1 Input_file2
Output will be as follows.
foo 1589.0
hi 33.7
Problem in your attempt: You was going good only thing in FNR==NR
condition your a[$1]
is NOT having any value it only created its index in array a
so that is why it was not able to print anything when 2nd Input_file is being read.
Upvotes: 2