Nishant Kansal
Nishant Kansal

Reputation: 601

Merge files with unequal number of rows based upon a particular column in bash

I am trying to merge 2 files with an unequal number of rows and based upon column 2 in each file.

File 1

  3 09
 10 11
 28 12
 23 14

File 2

 10 10
 11 11
 29 12
 25 13
 25 14

After merge, the file should be-

 09 3 0
 10 0 10
 11 10 11
 12 28 29
 13 0 25
 14 23 25

In another requirement, if file 1 or file 2 doesn't have a corresponding entry, I have to insert 0 there. The modified file 1, file 2 and the merged file is shared above.

Please help.

Upvotes: 1

Views: 852

Answers (4)

anubhava
anubhava

Reputation: 785008

You may use this gnu-awk command that would work irrespective of the fact which of the 2 files have more records:

awk 'FNR == NR {
   map[$2] = $1
   next
}
{
   map[$2] = ($2 in map ? map[$2] OFS : "") $1
}
END {
   PROCINFO["sorted_in"] = "@ind_str_asc"
   for (i in map)
      print i, map[i]
}' file1 file2

09 3
10 9 10
11 10 11
12 28 29
13 27 25
14 23 25

Upvotes: 3

vijay v
vijay v

Reputation: 2076

How about using join?

join -12 -22 -a1 -e 0 -o'0,1.1,2.1' f1 f2

Hope it is what you want. The fields that don't have a match will be given 0.

$ cat f1
3 09
9 10
10 11
28 12
27 13
23 14

$ cat f2
10 10
11 11
29 12
25 13
25 14

$ join -12 -22 -a1 -e 0 -o'0,1.1,2.1' f1 f2
09 3 0
10 9 10
11 10 11
12 28 29
13 27 25
14 23 25

The join command joins the lines of two files which share a common field of data. In this case: Join the file2 and the file1 using the field 1 ( -1 2) of the file f2 and the field 2 ( -2 2) of the file f1.

The output will be: joined field, field 1 of file1, field 1 of file2 -- -o'0,1.1,2.1', if there is a missing field put 0 -e0

If one of the two files have more records then add them (in this case file1) -a1

Upvotes: 1

Ed Morton
Ed Morton

Reputation: 203254

$ join -j 2 -a 1 -a 2 file1 file2
09 3
10 9 10
11 10 11
12 28 29
13 27 25
14 23 25

or if the real input files aren't already sorted on the 2nd column as in your sample:

$ join -j 2 -a 1 -a 2 <(sort -k2,2 file1) <(sort -k2,2 file2)
09 3
10 9 10
11 10 11
12 28 29
13 27 25
14 23 25

Upvotes: 2

Tyl
Tyl

Reputation: 5252

Try this pls:

awk 'NR==FNR{a[$2]=$1;next}{print $2,$1,a[$2]}' file2 file1

Upvotes: 1

Related Questions