Reputation: 169
I have two files as below
file1
name|address office
AK|Victoria Street
BK|Admond Street
DK|Business Street
file2
name|address home
AK|Nilofer Villa
ck|Bluewaters
bk|Homingo Apartment
the command or line of code should compare the first column of the two files and merge the columns as name|address office|address home
, and replace the NA wherever is not matched,content of the files can be huge. full output should be as below
file3
name|address office|address home
AK|Victoria Street |Nilofer Villa
BK|Admond Street|Homingo Apartment
DK|Business Street|NA
CK|NA|Bluewaters
here is what I have tried so far:
awk -F '|' 'NR==FNR{c[$1]++;next};c[$1] > 0' file1 file2
but above lines of code are not merging, just producing the output as difference based on column name. that too case sensitive
name|address home
AK|Nilofer Villa
Please Help, have checked few questions also, but not solving my purpose.
Upvotes: 1
Views: 74
Reputation: 203522
$ cat tst.awk
BEGIN { FS=OFS="|" }
{
name = (FNR>1 ? toupper($1) : $1)
if (!seen[name]++) {
names[++numNames] = name
vals[name,1] = vals[name,2] = "NA"
}
vals[name,ARGIND] = $2
}
END {
for (nameNr=1; nameNr<=numNames; nameNr++) {
name = names[nameNr]
print name, vals[name,1], vals[name,2]
}
}
$ awk -f tst.awk file1 file2
name|address office|address home
AK|Victoria Street|Nilofer Villa
BK|Admond Street|Homingo Apartment
DK|Business Street|NA
CK|NA|Bluewaters
The above uses GNU awk for ARGIND, with other awks just add FNR==1{ARGIND++}
at the start of the script.
Upvotes: 2
Reputation: 39414
You can do this with the join
command:
$ join -a 1 -a 2 -e NA -o '0,1.2,2.2' -t '|' -i <(sort f1) <(sort f2)
AK|Victoria Street|Nilofer Villa
BK|Admond Street|Homingo Apartment
ck|NA|Bluewaters
DK|Business Street|NA
name|address office|address home
Where:
-a 1
and -a 2
include the unjoined lines from either file.-e
and -o
work together to show the "NA" field. The man page doesn't mention this, but to use -e
you must specify -o
. We simply show the fields in this order: join column, second column from first file, second column from second file.-t
sets the delimiterOf course we must also sort the files prior to using join (it's requisite), so we use process substitution. If your shell doesn't have that, you may use temporary files.
Upvotes: 4