Reputation: 561
Hi, I need to do what is shown in the example below.
input file:
chr17 41246351 41246352 NM_007294_Exon_10
chr17 41246351 41246352 NM_007297_Exon_9
chr17 41246351 41246352 NM_007300_Exon_10
chr17 41246351 41246352 NR_027676_Exon_10
chr17 41246352 41246353 NM_007294_Exon_10
chr17 41246352 41246353 NM_007297_Exon_9
chr17 41246352 41246353 NM_007300_Exon_10
I want to get output like this:
chr17 41246351 41246352 NM_007294_Exon_10,NM_007297_Exon_9,NM_007300_Exon_10,NR_027676_Exon_10
chr17 41246352 41246353 NM_007294_Exon_10,NM_007297_Exon_9,NM_007300_Exon_10
I tried to use uniq and sort, but with no success. Thank you for any help.
Upvotes: 0
Views: 100
Reputation: 23667
$ perl -ne '($k,$v)=/^(.*\s)(\S+)$/; $h{$k} .= "$v,";
END{print "$_$h{$_}\n" foreach keys %h }' ip.txt
chr17 41246351 41246352 NM_007294_Exon_10,NM_007297_Exon_9,NM_007300_Exon_10,NR_027676_Exon_10,
chr17 41246352 41246353 NM_007294_Exon_10,NM_007297_Exon_9,NM_007300_Exon_10,
This leaves a trailing comma, though, which can be removed by piping the output through sed 's/,$//'
Or use the ?: conditional operator to add the comma only when needed (similar to the logic used by @sat in the awk solution), so no post-processing is required to remove a trailing comma:
$ perl -ne '($k,$v)=/^(.*\s)(\S+)$/; $h{$k} .= $h{$k}?",$v":"$v";
END{print "$_$h{$_}\n" foreach keys %h }' ip.txt
Upvotes: 1
Reputation: 14949
You can use this awk command:
awk '{i=$1 FS $2 FS $3} {a[i]=!a[i]?$4:a[i] FS $4} END {for (l in a) {print l,a[l]}}' file
If you want the last column to be comma-separated:
awk '{i=$1 FS $2 FS $3} {a[i]=!a[i]?$4:a[i] "," $4} END {for (l in a) {print l,a[l]}}' file
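One caveat: for (l in a) visits the keys in an unspecified order, so the groups may come out in a different order than in the input. A possible sketch that keeps first-seen order, assuming exactly four whitespace-separated columns per line; the function name group_intervals and the array names are only illustrative. It also tests membership with "in" rather than !a[i], so a value such as 0 in the fourth column would not be mistaken for an empty entry:

```shell
# Hypothetical helper name; reads the files given as arguments, or stdin.
group_intervals() {
    awk '
    {
        key = $1 FS $2 FS $3
        if (key in vals) {
            vals[key] = vals[key] "," $4   # append to an existing group
        } else {
            vals[key] = $4                 # start a new group
            order[++n] = key               # remember first-seen order
        }
    }
    END {
        for (j = 1; j <= n; j++) print order[j], vals[order[j]]
    }' "$@"
}
```

Called as group_intervals file on the input from the question, this should print the two desired lines in input order.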
Upvotes: 2