I want to compare the 2nd column of file2 with the 1st column of file1. If they are equal, I want to add the 2nd column of file1 to file2, as shown in output.txt.
file2
chr5 ENST00000514151 utr5 0 +
chr5 ENST00000512281 utr5 0 +
chr5 ENST00000512281 utr5 0 +
chr5 ENST00000512281 utr5 0 +
file1
ENST00000512281 a
ENST00000504031 b
ENST00000776348 c
output.txt
chr5 a ENST00000512281 utr5 0 +
chr5 a ENST00000512281 utr5 0 +
chr5 a ENST00000512281 utr5 0 +
I was able to compare the files with
awk 'NR==FNR{a[$1];next}$2 in a{print}' file1 file2
This gives below output:
chr5 ENST00000512281 utr5 0 +
chr5 ENST00000512281 utr5 0 +
chr5 ENST00000512281 utr5 0 +
But I do not know how to add the 2nd column of file1 to the output.
You can store the value of $2 in file1 in the array using a[$1]=$2. So you could try:
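awk '
# First file (file1): remember column 2 under the ID in column 1
NR==FNR {
    a[$1]=$2; next
}
# Second file (file2): for known IDs, insert the stored value after column 1
$2 in a {
    $1=$1 FS a[$2]
    print
}' file1 file2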
Output (with the file1 and file2 shown in the question):
chr5 a ENST00000512281 utr5 0 +
chr5 a ENST00000512281 utr5 0 +
chr5 a ENST00000512281 utr5 0 +
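The NR==FNR test is what separates the two files: NR counts records across all input while FNR restarts at 1 for each file, so the two are equal only while the first file (file1) is being read. Below is a minimal sketch that recreates the sample files from the question and runs the command on them. The temporary directory and the variable name result are only there for illustration.

```shell
# Recreate the sample files from the question in a scratch directory (illustrative only).
tmp=$(mktemp -d)
cat > "$tmp/file1" <<'EOF'
ENST00000512281 a
ENST00000504031 b
ENST00000776348 c
EOF
cat > "$tmp/file2" <<'EOF'
chr5 ENST00000514151 utr5 0 +
chr5 ENST00000512281 utr5 0 +
chr5 ENST00000512281 utr5 0 +
chr5 ENST00000512281 utr5 0 +
EOF

result=$(awk '
NR==FNR { a[$1]=$2; next }          # file1: map ID to its label
$2 in a { $1=$1 FS a[$2]; print }   # file2: insert the label after column 1
' "$tmp/file1" "$tmp/file2")
printf '%s\n' "$result"
rm -rf "$tmp"
```

The first line of file2 (ENST00000514151) has no entry in file1, so it is skipped.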
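Explanation:
For each matching line, the value is inserted after $1 in file2 using $1=$1 FS a[$2], where FS is the default field separator, which is a space. Assigning to $1 also makes awk rebuild the record, so the modified line can be printed by print afterwards.
print can be replaced by a trailing 1 if desired, like $2 in a { $1=$1 FS a[$2] }1, but note that this form also prints the lines of file2 that have no match, unchanged.
Because the assignment rebuilds the record in file2, any sequences of spaces or tabs in those lines will be collapsed to a single space in the output. To keep the original formatting in file2 one could use the split() function in Gnu Awk version 4, which can also return the separators it found.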
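As a portable alternative to the Gnu Awk split() route, the value can be spliced directly into $0 with sub(), which leaves the rest of the line, including tabs and runs of spaces, untouched. This is a sketch rather than the method from the answer. It assumes the values in file1 contain no & or backslash (both are special in a sub() replacement), and the tab-separated sample file2 and the variable name result are made up for illustration.

```shell
tmp=$(mktemp -d)
printf 'ENST00000512281 a\n' > "$tmp/file1"
# Hypothetical file2 with tab-separated columns, to make the formatting visible.
printf 'chr5\tENST00000512281\tutr5\t0\t+\n' > "$tmp/file2"

result=$(awk '
NR==FNR { a[$1]=$2; next }
$2 in a {
    # "&" stands for the matched text (leading blanks plus column 1);
    # append a space and the looked-up value, leave the rest of $0 as is.
    sub(/^[ \t]*[^ \t]+/, "& " a[$2])
    print
}' "$tmp/file1" "$tmp/file2")
printf '%s\n' "$result"
rm -rf "$tmp"
```

Because no field is assigned, awk never rebuilds the record, so the original tabs between the remaining columns survive.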