Jordan.lamarche

Reputation: 23

Joining two files with multiple columns via AWK

First of all, I must apologise: I know there are plenty of existing topics that already answer my question, but as you'll see for yourself, AWK isn't really a big friend of mine.

You all know the story, right? ;) "Hey, random employee, you are the chosen one! I need you to learn this strange thing that none of us knows. Your deadline is tomorrow, good luck!"

I won't complain about it anymore (promise! :p), but after many tries, I still can't understand everything (who said "a single thing"?) about AWK.

So, here is my question!

I have two files, with the following columns:

File A.txt:

A B C D E F G H

File B.txt:

A C F I

I want to get the following output by joining these two files into a third one:

Output file C.txt:

A B C D E F G H I

I would like to join them: for every line of A.txt whose columns A, C and F match a line of B.txt, append "I" to that line, and drop the lines of A.txt that have no match.

So far, I know that I must use something like this:

awk '
    FNR==NR{Something ;next}
    {print $0}
' A.txt B.txt

Yeah, I know. It doesn't look like much of a start.

Any hero out there?

Upvotes: 1

Views: 1232

Answers (1)

glenn jackman

Reputation: 246754

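awk '
    # First file (A.txt): index each whole line by its columns 1, 3 and 6 (A, C, F)
    NR==FNR {A[$1,$3,$6] = $0; next}
    # Second file (B.txt): on a key match, print the stored A.txt line plus column I
    ($1 SUBSEP $2 SUBSEP $3) in A {print A[$1,$2,$3], $4}
' A.txt B.txt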

That requires the whole of A.txt to be stored in memory. If B.txt is significantly smaller, store B.txt instead and stream A.txt:

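awk '
    # First file (B.txt): remember column I under the key built from A, C and F
    NR==FNR {B[$1,$2,$3] = $4; next}
    # Second file (A.txt): on a key match, print the line followed by the stored I
    ($1 SUBSEP $3 SUBSEP $6) in B {print $0, B[$1,$3,$6]}
' B.txt A.txt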
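
To sanity-check both variants, here is a small sketch on made-up sample rows. The values (a1, b1, ...), the temporary directory and the variable names out1 and out2 are assumptions for illustration only, and the columns are assumed to be whitespace-separated. Note that the first variant prints matches in B.txt order, while the second prints them in A.txt order.

```shell
# Hypothetical sample data written to a throwaway directory
tmp=$(mktemp -d)
printf '%s\n' \
    'a1 b1 c1 d1 e1 f1 g1 h1' \
    'a2 b2 c2 d2 e2 f2 g2 h2' \
    'a3 b3 c3 d3 e3 f3 g3 h3' > "$tmp/A.txt"
printf '%s\n' \
    'a3 c3 f3 i3' \
    'a1 c1 f1 i1' \
    'a9 c9 f9 i9' > "$tmp/B.txt"

# Variant 1: store A.txt in memory, stream B.txt
out1=$(awk '
    NR==FNR {A[$1,$3,$6] = $0; next}
    ($1 SUBSEP $2 SUBSEP $3) in A {print A[$1,$2,$3], $4}
' "$tmp/A.txt" "$tmp/B.txt")

# Variant 2: store B.txt in memory, stream A.txt
out2=$(awk '
    NR==FNR {B[$1,$2,$3] = $4; next}
    ($1 SUBSEP $3 SUBSEP $6) in B {print $0, B[$1,$3,$6]}
' "$tmp/B.txt" "$tmp/A.txt")

printf 'variant 1:\n%s\n' "$out1"
printf 'variant 2:\n%s\n' "$out2"

rm -rf "$tmp"
```

With this sample, the a2 line of A.txt and the a9 line of B.txt have no partner, so neither appears in the output. To get the result into C.txt as in the question, redirecting the output of either awk command (for example with > C.txt) should be enough.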

Upvotes: 4
