Reputation: 147
I'm trying to combine lines into one where all fields except the last match, appending the last field of each matching line. Input:
12345,this,is,one,line,1
13567,this,is,another,line,3
14689,and,this,is,another,6
12345,this,is,one,line,4
14689,and,this,is,another,10
Desired output:
12345,this,is,one,line,1,4
13567,this,is,another,line,3
14689,and,this,is,another,6,10
Thanks
Upvotes: 1
Views: 275
Reputation: 1118
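awk -F',' '{if($1 in a) {a[$1]=a[$1] "," $NF} else {a[$1]=$0}} END {n=asort(a); for(i=1;i<=n;i++) print a[i]}' < input.txt
This works with the given example. Note that asort() is a GNU awk (gawk) extension, and that the sorted array has to be walked by index from 1 to n, because "for (i in a)" does not guarantee any order.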
Here is a commented file version of the same awk script, parse.awk. Keep in mind that this version uses only the first field as the key that identifies matching rows. I'll rewrite it below according to the author's comment above (compare all fields but the last one).
#!/usr/bin/awk -f
# Requires GNU awk (gawk): asort() is a gawk extension.

BEGIN {                             # the BEGIN section runs once, before any input is read
    FS = ","                        # input field separator is a comma (can also be set with -F on the command line)
}

{                                   # the main section runs on every input line
    if ($1 in a) {                  # is there already an element of array 'a' indexed by the first field?
        a[$1] = a[$1] "," $NF       # if so, append the last field of the current row
    } else {                        # otherwise this line starts a new entry
        a[$1] = $0                  # store the whole line as a new array element
    }
}

END {                               # the END section runs once, after the last line
    n = asort(a)                    # sort array 'a' by its values; it is re-indexed from 1 to n
    for (i = 1; i <= n; i++)        # walk the indices in order ("for (i in a)" has no guaranteed order)
        print a[i]                  # print each merged row
}
Make the script executable (chmod +x parse.awk) and run it with:
./parse.awk input.txt
Here is another version, which compares rows on all fields but the last:
#!/usr/bin/awk -f
# Requires GNU awk (gawk): asort() is a gawk extension.

BEGIN {                             # the BEGIN section runs once, before any input is read
    FS = ","                        # input field separator is a comma (can also be set with -F on the command line)
}

{                                   # the main section runs on every input line
    idx = ""                        # reset the index variable
    for (i = 1; i < NF; ++i)        # join all but the last field to build the index,
        idx = idx $i FS             # keeping the separator so that "ab","c" and "a","bc" stay distinct
    if (idx in a) {                 # is there already an element of array 'a' with this index?
        a[idx] = a[idx] "," $NF     # if so, append the last field of the current row
    } else {                        # otherwise this line starts a new entry
        a[idx] = $0                 # store the whole line as a new array element
    }
}

END {                               # the END section runs once, after the last line
    n = asort(a)                    # sort array 'a' by its values; it is re-indexed from 1 to n
    for (i = 1; i <= n; i++)        # walk the indices in order ("for (i in a)" has no guaranteed order)
        print a[i]                  # print each merged row
}
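If gawk is not available, the same idea can be sketched without asort(). The version below is only an illustration, not part of the original script: the function name merge_rows and the arrays merged and order are made up for this example. It assumes that printing rows in the order their key first appears is acceptable, which is not the same as sorting them.

```shell
# Hypothetical helper: merge rows whose fields (all but the last) match.
# Reads the files given as arguments, or standard input if there are none.
merge_rows() {
  awk -F',' '
  {
    key = $1
    for (i = 2; i < NF; i++)        # build the key from all fields but the last
      key = key FS $i
    if (key in merged) {
      merged[key] = merged[key] "," $NF   # known key: append the last field
    } else {
      merged[key] = $0                    # new key: keep the whole line
      order[++n] = key                    # remember first-seen order
    }
  }
  END {
    for (j = 1; j <= n; j++)        # print in first-seen order, no sorting
      print merged[order[j]]
  }' "$@"
}
```

Call it as merge_rows input.txt, or pipe data into it. Because it relies only on plain arrays and a counter, it should also run under awk implementations that lack asort().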
Feel free to ask for any further explanation.
Upvotes: 2
Reputation: 58391
This might work for you (GNU sed and sort):
sort -nt, -k1,1 -k6,6 file |
sed ':a;$!N;s/^\(\([^,]*,\).*\)\n\2.*,/\1,/;ta;P;D'
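For readers unfamiliar with the N, P and D commands, here is the same pipeline spelled out one command per line with comments. This is only a reading aid: the wrapper function name merge_sorted_rows is invented for this sketch, and like the original it assumes GNU sed, six fields per row, and a match on the first field only.

```shell
# Hypothetical wrapper around the pipeline above.
# Reads the files given as arguments, or standard input if there are none.
merge_sorted_rows() {
  # Sort numerically on field 1, then on field 6, so matching rows are adjacent.
  sort -nt, -k1,1 -k6,6 "$@" | sed '
# Loop label.
:a
# Unless this is the last input line, append the next line to the pattern space.
$!N
# If the appended line starts with the same first field as the first line,
# drop everything in it up to its last comma, leaving only its last field.
s/^\(\([^,]*,\).*\)\n\2.*,/\1,/
# If the substitution succeeded, loop to try the following line as well.
ta
# Otherwise print the first line of the pattern space ...
P
# ... delete it, and restart the script with whatever remains.
D
'
}
```

Usage would be merge_sorted_rows file. Since the rows are sorted first, the output comes out ordered by the first field rather than in input order.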
Upvotes: 0