Htat Ko Lin
Htat Ko Lin

Reputation: 43

How to merge separated fields in rows into one based on common fields in AWK / UNIX

I am still just a new user to UNIX and especially to AWK. I am having the problem to merge rows based on first 2 columns values. My original data from a file as below:

Original data content
========================

ID1 ID2 Field1  Field2
1   1   11F1    11F2
1   2   12F1    12F2
2   1   21F1    21F2
2   2   22F1    22F2
ID1 ID2 Field3  Field4
1   1   11F3    11F4
1   2   12F3    12F4
2   1   21F3    21F4
2   2   22F3    22F4
ID1 ID2 Field5  Field6
1   1   11F5    11F6
1   2   12F5    12F6
2   1   21F5    21F6
2   2   22F5    22F6

As you noticed, columns are split into different rows/block but IDs fields and columns heading are still available and repeated. So what i want to achieve is as follow:

ID1 ID2 Field1  Field2  Field3  Field4  Field5  Field6
1   1   11F1    11F2    11F3    11F4    11F5    11F6
1   2   12F1    12F2    12F3    12F4    12F5    12F6
2   1   21F1    21F2    21F3    21F4    21F5    21F6
2   2   22F1    22F2    22F3    22F4    22F5    22F6

Merge all into as a single block/table. But don't know how to do it in AWK, or is it possible do achieve with AWK.

Thank a lot. Htat Ko

Upvotes: 2

Views: 897

Answers (2)

nwk
nwk

Reputation: 4050

You can do this without Awk as well:

$ sort -n data | sed -e 's/  */ /g' | paste - - - -d' ' | cut -d' ' -f 1,2,3,4,7,8,11,12

Output

ID1 ID2 Field1 Field2 Field3 Field4 Field5 Field6
1 1 11F1 11F2 11F3 11F4 11F5 11F6
1 2 12F1 12F2 12F3 12F4 12F5 12F6
2 1 21F1 21F2 21F3 21F4 21F5 21F6
2 2 22F1 22F2 22F3 22F4 22F5 22F6

Upvotes: 1

jaypal singh
jaypal singh

Reputation: 77185

Yes, it is possible using awk:

awk ' 
{ key = $1 FS $2 }
!seen[key]++ { keys[++total] = key }
{ values[key] = ( key in values ? values[key] FS $3 FS $4 : $3 FS $4 ) }
END {
    for (cnt=1; cnt<=total; cnt++) 
    print keys[cnt], values[keys[cnt]]
}' file
  • Form the first and second column as key
  • Using an array seen remember the order of occurrence.
  • Test if your key is present in an array (we are using ternary op here). If it is present, append the existing value with new data set. If it is not present, push it as the value.
  • In the END block, iterate and print.
  • If you have many columns then store the column in a variable and remove the keys from it before storing.

Upvotes: 3

Related Questions