Reputation: 63
I am studying the sed command, but I have a problem combining two files.
file1.txt
A 1
C 3
E 5
file2.txt
1 John Lennon
2 Mariah carey
3 Cool & The Gang
4 Westlife
5 Red Hot Chili Peppers
desired output
1 John Lennon A
2 Mariah Carey
3 Cool & The Gang C
4 Westlife
5 Red Hot Chili Peppers E
I tried an awk script like this:
awk 'FNR==NR{seen[$1]=$2; next} $1 in seen{seen[$1]=seen[$1] OFS $2} END{ for (e in seen) print e, seen[e]}' file2.txt file1.txt | sort -V
However, the output shows only the first word of each singer's name (John, Mariah, Cool, Westlife, and Red) instead of the full name. Is something wrong with my script?
Upvotes: 0
Views: 94
Reputation: 52344
If the columns of the two files are separated by tabs instead of spaces (it looks like the first one is; I can't tell about the second, since SO's markdown is not tab friendly), it's a trivial join:
$ join -12 -21 -o 0,2.2,1.1 -t$'\t' -a2 <(sort -t$'\t' -k2,2 file1.txt) <(sort -t$'\t' -k1,1 file2.txt)
1 John Lennon A
2 Mariah carey
3 Cool & The Gang C
4 Westlife
5 Red Hot Chili Peppers E
(join requires its files to be sorted lexicographically on the join column, not numerically, hence the sorts.)
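To illustrate the difference between the two orderings, here is a minimal sketch. The keys 2, 10 and 1 are hypothetical (the question's files only go up to 5), and LC_ALL=C is an assumption used only to make the lexicographic order reproducible.

```shell
# Hypothetical keys; 10 is included so the two orderings differ.
numeric=$(printf '%s\n' 2 10 1 | sort -n | paste -sd ' ' -)
lexical=$(printf '%s\n' 2 10 1 | LC_ALL=C sort | paste -sd ' ' -)

echo "numeric: $numeric"   # numeric: 1 2 10
echo "lexical: $lexical"   # lexical: 1 10 2
```

join expects the second ordering, so sorting the join column with sort -n would only be safe while all keys have the same number of digits.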
If there's just a space between the number and the band in file2, convert it to a tab first with sed:
join -12 -21 -o 0,2.2,1.1 -t$'\t' -a2 <(sort -t$'\t' -k2,2 file1.txt) <(sed 's/ /\t/' file2.txt | sort -t$'\t' -k1,1)
Upvotes: 1
Reputation: 784998
This can be done with a fairly simple two-step process in awk, and there is no need to use sort since we can process file2 in the second phase:
awk 'FNR==NR{seen[$2]=$1; next} $1 in seen{$0 = $0 OFS seen[$1]} 1' file1.txt file2.txt
1 John Lennon A
2 Mariah carey
3 Cool & The Gang C
4 Westlife
5 Red Hot Chili Peppers E
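The two phases can be spelled out in a commented sketch. It assumes the sample files are space-separated, recreates them in a temporary directory, and uses the array name letter and the shell variable merged purely for illustration.

```shell
# Recreate the sample files from the question (space-separated is an assumption).
tmp=$(mktemp -d)
printf '%s\n' 'A 1' 'C 3' 'E 5' > "$tmp/file1.txt"
printf '%s\n' '1 John Lennon' '2 Mariah carey' '3 Cool & The Gang' \
  '4 Westlife' '5 Red Hot Chili Peppers' > "$tmp/file2.txt"

merged=$(awk '
  # Phase 1: first file only (FNR == NR). Map number -> letter.
  FNR == NR { letter[$2] = $1; next }
  # Phase 2: second file. Append the letter when the number is known.
  $1 in letter { $0 = $0 OFS letter[$1] }
  # Print every line of the second file, changed or not.
  { print }
' "$tmp/file1.txt" "$tmp/file2.txt")

printf '%s\n' "$merged"
rm -rf "$tmp"
```

Because the second file is printed in its own order and the whole line ($0) is kept, no sort is needed and multi-word names survive intact.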
Upvotes: 2
Reputation: 133458
Use the following code; I have made minor changes to your attempt.
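awk '
FNR==NR{                 # first file (file2.txt): number -> full name
  val=$1
  $1=""
  sub(/^ +/,"")
  seen[val]=$0
  next
}
$2 in seen{              # file1.txt line whose number has a name
  print $2,seen[$2],$1
  b[$2]                  # mark this number as already printed
  next
}
END{                     # print the names that had no letter in file1.txt
  for(i in seen){
    if(!(i in b)){
      print i,seen[i]
    }
  }
}
' file2.txt file1.txt | sort -V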
Output will be as follows.
1 John Lennon A
2 Mariah carey
3 Cool & The Gang C
4 Westlife
5 Red Hot Chili Peppers E
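Problem with OP's attempted code: seen[$1]=$2 stores only the second field of each file2.txt line, so only the first word of the name is printed (John OR Mariah and so on).
Upvotes: 1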
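A minimal sketch of why the original script prints only one word: awk splits each line on whitespace, so $2 is just the first word of the name. The example assumes a single space after the number; the variable names are only for illustration.

```shell
line='5 Red Hot Chili Peppers'

# $2 is only the second whitespace-separated field.
second_field=$(printf '%s\n' "$line" | awk '{ print $2 }')

# Stripping the leading number from $0 keeps the rest of the line.
full_name=$(printf '%s\n' "$line" | awk '{ sub(/^[^ ]+ +/, ""); print }')

echo "$second_field"   # Red
echo "$full_name"      # Red Hot Chili Peppers
```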