Anuj Kulkarni
Anuj Kulkarni

Reputation: 137

how to merge rows ina csv file if if the first column is same in bash

I have written a program to find the open ports of a given IP. since the print function is in a loop my output format is like this:

IP1,22
IP1,23
IP1,24
IP2,25
IP2,26
IP3,27
IP3,30
IP3,45

How do I get it in this format:

IP1,22,23,24
IP2,25,26
IP3,27,30,45

EDIT: this is what I have done so far

awk'{a[$1]=(a[$1])? a[$1]r : r }

but I dont know how to progress forward from here.

Upvotes: 1

Views: 204

Answers (3)

Unix Lover
Unix Lover

Reputation: 1

sort anuj.txt | cut -d, -f1 | sort -u  | while read line; do echo $line,$(grep $line anuj.txt | cut -d, -f2 | paste -sd, -); done

Assumptions: Your file is comma separated and has the same format you show above.

Explanation (FWIW): First find the unique IP identifiers, then grep for that string in the file, for each line found get the second field and concatenate them.

Upvotes: 0

RavinderSingh13
RavinderSingh13

Reputation: 133518

Kindly always do add your efforts in your question in code tags. If you are not worried for order of output then try following.

awk 'BEGIN{FS=OFS=","} {a[$1]=($1 in a ? a[$1] OFS : "") $2} END{for(i in a){print i,a[i]}}'  Input_file

In case you need to get output in same order in which 1st field of Input_file is coming then try following.

awk '
BEGIN{
  FS=OFS=","
}
!b[$1]++{
  c[++count]=$1
}
{
  a[$1]=($1 in a ? a[$1] OFS : "") $2
}
END{
  for(i=1;i<=count;i++){
    print c[i],a[c[i]]
  }
}'  Input_file

Since later OP found out control-m characters were found in Input_file(s) too so adding following:

tr -d '\r' < Input_file > temp && mv temp Input_file

Upvotes: 3

stack0114106
stack0114106

Reputation: 8711

Another awk..

$ cat anuj.txt
IP1,22
IP1,23
IP1,24
IP2,25
IP2,26
IP3,27
IP3,30
IP3,45
$ awk -F, ' { c=$1; if(c!=p) { printf("\n%s,%d",$1,$2)} else { printf(",%d",$2) } p=c } END { print ""} ' anuj.txt

IP1,22,23,24
IP2,25,26
IP3,27,30,45
$

There is an empty line in the above solution. To get rid of it, you can use below

$ awk -F, ' { c=$1; if(c!=p) { printf("%s%s,%d",sep,$1,$2)} else { printf(",%d",$2) } p=c;sep="\n" } END { print ""} ' anuj.txt
IP1,22,23,24
IP2,25,26
IP3,27,30,45
$

Upvotes: 2

Related Questions