Reputation: 1
Kindly let me know how to compare 2 files and merge both matching and non matching lines. I have checked all the answers provide previously but nothing suits my requirement.Please find the Sample data sets below
Contents of file1.csv
:
J2D TEXAS 43988
J2D AUSTIN 21305
J2D CUPERTINO 378563
J2D BELGIUM 569632
J2D UK 0
J2D US 8
J2D INDIA 75321
Contents of file2.csv
:
J2D TEXAS 25463
J2D AUSTIN 5986
J2D CUPERTINO 0234
J2D BELGIUM 123468
J2D UK 5874
J2D US 2365
J2D IRAQ 8982
I have tried below command but it is not working in my scenario:
awk 'NR==FNR{a[$2]=$3;next;}{print $0 " " ($2 in a ? a[$2] : "NA")}' file2.csv file1.csv
Output:
J2D TEXAS 43988 25463
J2D AUSTIN 21305 5986
J2D CUPERTINO 378563 0234
J2D BELGIUM 569632 123468
J2D UK 0 5874
J2D US 8 2365
J2D INDIA 75321 NA
In the above result, you can see "IRAQ" from file2.csv
is missing.
awk 'NR==FNR{a[$2]=$3;next;}{print $0 " " ($2 in a ? a[$2] : "NA")}' file1.csv file2.csv
Output:
J2D TEXAS 25463 43988
J2D AUSTIN 5986 21305
J2D CUPERTINO 0234 378563
J2D BELGIUM 123468 569632
J2D UK 5874 0
J2D US 2365 8
J2D IRAQ 8982 NA
In the above result you can see "INDIA" from file1.csv
is missing
Below is the expected output. kindly help me to get the desired output
EXPECTED OUTPUT:
J2D TEXAS 43988 25463
J2D AUSTIN 21305 5986
J2D CUPERTINO 378563 0234
J2D BELGIUM 569632 123468
J2D UK 0 5874
J2D US 8 2365
J2D INDIA 75321 NA
J2D IRAQ NA 8982
Upvotes: 0
Views: 360
Reputation: 67467
awk
to the rescue!
$ awk '{k=$1 FS $2}
NR==FNR {a[k]=$3; next}
{print $0, (k in a)?a[k]:"NA"; delete a[k]}
END {for(k in a) print k,"NA",a[k]}' file2 file1 | column -t
J2D TEXAS 43988 25463
J2D AUSTIN 21305 5986
J2D CUPERTINO 378563 0234
J2D BELGIUM 569632 123468
J2D UK 0 5874
J2D US 8 2365
J2D INDIA 75321 NA
J2D IRAQ NA 8982
Upvotes: 0
Reputation: 92854
paste + awk solution:
paste file1.csv file2.csv | awk '{ if($2==$5) { print $1,$2,$3,$6 }
else { print $1,$2,$3,"NA","\n",$4,$5,"NA",$6 }}' | column -tx
The output:
J2D TEXAS 43988 25463
J2D AUSTIN 21305 5986
J2D CUPERTINO 378563 0234
J2D BELGIUM 569632 123468
J2D UK 0 5874
J2D US 8 2365
J2D INDIA 75321 NA
J2D IRAQ NA 8982
Details:
paste file1.csv file2.csv
- merge lines of files
if($2==$5) { print $1,$2,$3,$6 }
- if files matched by the second column ($5
field points to the early second column of file2.csv
)
print $1,$2,$3,"NA","\n",$4,$5,"NA",$6
- print unmacthed lines as separate lines with NA
in relative positions
http://man7.org/linux/man-pages/man1/paste.1.html
Upvotes: 0