Road King

Reputation: 147

Combine some data from multiple lines

I'm trying to combine lines whose leading fields match into a single line, appending the last field of each matching line to the first one.

12345,this,is,one,line,1
13567,this,is,another,line,3
14689,and,this,is,another,6
12345,this,is,one,line,4
14689,and,this,is,another,10

Desired output:

12345,this,is,one,line,1,4
13567,this,is,another,line,3
14689,and,this,is,another,6,10

Thanks

Upvotes: 1

Views: 275

Answers (2)

igustin

Reputation: 1118

awk -F',' '{if($1 in a) {a[$1]=a[$1] "," $NF} else {a[$1]=$0}} END {n=asort(a); for(i=1;i<=n;i++) print a[i]}' < input.txt

This works with the given example. Note that asort() is a GNU awk (gawk) extension.

Here is a commented file version of the same awk script, parse.awk. Keep in mind that this version uses only the first field to decide which rows belong together. Further down I rewrite it according to the author's comment above (all fields but the last one).

#!/usr/bin/awk -f

BEGIN {   # the BEGIN section runs once, before any input is read
    FS=","   # input field separator is a comma (can also be set with -F on the command line)
}

{   # the main section runs for every input line
    if($1 in a) {   # check whether array 'a' already has an element indexed by the first field
        a[$1]=a[$1] "," $NF   # if the entry already exists, append the last field of the current row
    }
    else {   # this line starts a new entry
        a[$1]=$0   # store the whole line as a new array element
    }
}

END {   # the END section runs once, after the last line
    n=asort(a)   # sort array 'a' by its values (gawk extension); returns the element count
    for(i=1;i<=n;i++) print a[i]   # print in sorted order; for(i in a) does not guarantee order
}

Make the script executable and run it with:

./parse.awk input.txt

Here is another version that compares rows on all fields but the last:


#!/usr/bin/awk -f

BEGIN {   # the BEGIN section runs once, before any input is read
    FS=","   # input field separator is a comma (can also be set with -F on the command line)
}

{   # the main section runs for every input line
    idx=""   # reset the index variable
    for(i=1;i<NF;++i) idx=idx FS $i   # join all but the last field; keeping the separator stops "ab,c" and "a,bc" from colliding
    if(idx in a) {   # check whether array 'a' already has an element with this index
        a[idx]=a[idx] "," $NF   # if the entry already exists, append the last field of the current row
    }
    else {   # this line starts a new entry
        a[idx]=$0   # store the whole line as a new array element
    }
}

END {   # the END section runs once, after the last line
    n=asort(a)   # sort array 'a' by its values (gawk extension); returns the element count
    for(i=1;i<=n;i++) print a[i]   # print in sorted order; for(i in a) does not guarantee order
}
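
If gawk is not available, or the rows should come out in the order they first appear instead of sorted, something along these lines might do. It is only a sketch: the function name merge_rows and the arrays rows and order are made up for illustration, and it assumes every line contains at least one comma. It uses nothing beyond POSIX awk features.

```shell
#!/bin/sh
# Hypothetical helper: reads comma-separated rows on stdin and appends the
# last field of each later row to the first row sharing all other fields.
merge_rows() {
    awk -F',' '
    {
        key = $0
        sub(/,[^,]*$/, "", key)        # key = the line minus its last field
        if (key in rows) {
            rows[key] = rows[key] "," $NF
        } else {
            rows[key] = $0
            order[++n] = key           # remember first-seen order
        }
    }
    END {
        for (i = 1; i <= n; i++) print rows[order[i]]
    }'
}

# Sample data from the question
printf '%s\n' \
    '12345,this,is,one,line,1' \
    '13567,this,is,another,line,3' \
    '14689,and,this,is,another,6' \
    '12345,this,is,one,line,4' \
    '14689,and,this,is,another,10' | merge_rows
```

Keeping a separate order array avoids sorting altogether, so no asort() is needed and the output follows the input order.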

Feel free to ask for any further explanation.

Upvotes: 2

potong

Reputation: 58391

This might work for you (GNU sed and sort):

sort -nt, -k1,1 -k6,6 file | 
sed ':a;$!N;s/^\(\([^,]*,\).*\)\n\2.*,/\1,/;ta;P;D'
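
For anyone who finds the one-liner hard to read, here is the same pipeline spelled out with comments. This is my reading of it, not the answerer's own explanation; the wrapper function merge_sorted is a made-up name, and GNU sed and sort are assumed as stated above. Note that it matches rows on the first field only.

```shell
#!/bin/sh
# Hypothetical wrapper around the pipeline above; reads rows on stdin.
merge_sorted() {
    # Sort numerically on field 1, then on field 6, so matching rows are adjacent.
    sort -nt, -k1,1 -k6,6 |
    sed '
        # Loop label.
        :a
        # Unless this is the last line, append the next line to the pattern space.
        $!N
        # If the second line starts with the same first field as the first line,
        # replace it with a comma plus its last field.
        s/^\(\([^,]*,\).*\)\n\2.*,/\1,/
        # If that substitution happened, loop and try the following line too.
        ta
        # Otherwise print the first line of the pattern space,
        P
        # delete it, and restart with what is left.
        D
    '
}

# Sample data from the question
printf '%s\n' \
    '12345,this,is,one,line,1' \
    '13567,this,is,another,line,3' \
    '14689,and,this,is,another,6' \
    '12345,this,is,one,line,4' \
    '14689,and,this,is,another,10' | merge_sorted
```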

Upvotes: 0
