Reputation: 83
I have fixed length file (sorted) and need to concatenate field from multiple rows based on key fields. The key fields are first two columns. The first 3 rows has white spaces at the beginning. The actual file does not have "spaces" word I have put that here to keep the formatting.
Input file
spaces2 001 111 111 Company Code 227
spaces2 001 111 111 Plan Code 012
spaces2 001 111 111 Invoice Number 2014010
0000002 2 111 111 Company Code 214
0000002 2 111 111 Plan Code 20
0000002 2 111 111 Invoice Number 3014010
Desired output:
spaces2 001 111 111 Company Code 227-12-2014010
0000002 2 111 111 Company Code 214-20-3014010
Could someone please suggest how to do this.
Upvotes: 2
Views: 398
Reputation: 77185
Try something like this:
$ cat file
2 001 111 111 Company Code 227
2 001 111 111 Plan Code 012
2 001 111 111 Invoice Number 2014010
0000002 2 111 111 Company Code 214
0000002 2 111 111 Plan Code 20
0000002 2 111 111 Invoice Number 3014010
$ awk '(($1,$2) in a){a[$1,$2]=a[$1,$2]"-"$NF;next}{gsub(/ *$/,"");a[$1,$2]=$0}END{for(x in a) print a[x]}' file
2 001 111 111 Company Code 227-012-2014010
0000002 2 111 111 Company Code 214-20-3014010
Note that arrays in awk
are associative so the order of output will be random. If order is important then it needs to be handled with additional code.
If order is critical for you then you can do something like:
awk '($1,$2) in seen {
lines[i] = lines[i]"-"$NF;
next
}
{
gsub(/ *$/,"");
seen[$1,$2]++;
lines[++i] = $0
}
END {
for(line = 1; line <= length(lines); line++) {
print lines[line]
}
}' file
Explaination:
($1,$2) in seen
: This looks for first two fields in array seen. If it is present, it appends the last field to array line and next
skips rest of the action.{ .. }
- This code block gets executed for lines that are not in array seen. Typicall your first line will always enter here. So we use gsub
to remove trailing space and populate the array seen
and store that line in array line
. END
block, we iterate over the array line
and print it. Since we use incremental index, the order is preserved.
Note: This expects your input data to be orderedUpvotes: 2