Sort contents of awk associative array element

Question

The original file looks like this:

1.2.3.4: 1,3,4
1.2.3.5: 9,8,7,6
1.2.3.4: 4,5,6
1.2.3.6: 1,1,1

After my first, incorrect attempt at sorting it, I have this:

1.2.3.4: 1,3,4,4,5,6,
1.2.3.5: 9,8,7,6,
1.2.3.6: 1,1,1,

I want to sort it into the following format:

1.2.3.4: 1,3,4,5,6
1.2.3.5: 6,7,8,9
1.2.3.6: 1

How do I access each comma-delimited value in each array element, sort the values in ascending order, and delete the duplicates? The only command I have managed to write so far works on the whole element only:

awk -F' ' 'NF>1{a[$1] = a[$1]$2","}END{for(i in a){print i" "a[i] | "sort -t: -k1 "}}' c.txt

Wintermute · Accepted Answer

EDIT: My first answer took the intermediate data as input, because the original data had not been posted yet. It is also possible to start from the original data, again with GNU awk (arrays of arrays need gawk 4.0 or later):

gawk -F '[ ,]' 'BEGIN { PROCINFO["sorted_in"] = "@ind_num_asc" } { for(i = 2; i <= NF; ++i) a[$1][$i]; } END { for(ip in a) { line = ip " "; for(n in a[ip]) { line = line n "," } sub(/,$/, "", line); print line } }' filename

The code works as follows:

BEGIN {
  PROCINFO["sorted_in"] = "@ind_num_asc"  # GNU-specific: traverse arrays in
                                          # ascending numeric index order
}
{
  for(i = 2; i <= NF; ++i) a[$1][$i]      # remember numbers by ip; using the
                                          # number as an index drops duplicates
}
END {                                     # in the end:
  for(ip in a) {                          # for all ips:
    line = ip " "                         # construct the line: IP
    for(n in a[ip]) {                     # numbers in order
      line = line n ","
    }
    sub(/,$/, "", line)                   # remove trailing comma
    print line                            # print the result.
  }
}
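
If GNU awk is not available, the same result can probably be reached with POSIX awk plus sort. This is only a sketch under that assumption, not part of the gawk solution above: it splits every line into "ip number" pairs, lets sort order and deduplicate the pairs, and then glues them back together. The function name sort_unique_by_ip and the temporary sample file are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads "ip: n,n,n" lines from the file named in $1.
sort_unique_by_ip() {
  # 1) explode each line into "ip number" pairs (empty fields are skipped,
  #    so a trailing comma does no harm)
  # 2) sort by ip, then numerically by number, dropping duplicate pairs
  # 3) regroup the pairs into one comma-separated line per ip
  awk -F '[ ,]' '{ for (i = 2; i <= NF; ++i) if ($i != "") print $1, $i }' "$1" |
    LC_ALL=C sort -k1,1 -k2,2n -u |
    awk '$1 != ip { if (NR > 1) printf "\n"; ip = $1; printf "%s %s", ip, $2; next }
         { printf ",%s", $2 }
         END { if (NR > 0) printf "\n" }'
}

# Sample data from the question, written to a temporary file.
sample=$(mktemp)
printf '%s\n' '1.2.3.4: 1,3,4' '1.2.3.5: 9,8,7,6' \
              '1.2.3.4: 4,5,6' '1.2.3.6: 1,1,1' > "$sample"
sort_unique_by_ip "$sample"
rm -f "$sample"
```

Note that sort -u with a numeric key treats 01 and 1 as the same value, so only one of them would survive.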

Old answer for intermediate data:

With GNU awk, assuming that the data is formatted precisely as in the question (with a trailing ,):

gawk -F '[ ,]' 'BEGIN { PROCINFO["sorted_in"] = "@ind_num_asc" } { delete a; for(i = 2; i < NF; ++i) a[$i]; line = $1 " "; for(i in a) line = line i ","; sub(/,$/, "", line); print line; }' filename

The file contents are split along spaces and commas, then the code works as follows:

BEGIN { 
  PROCINFO["sorted_in"] = "@ind_num_asc"  # GNU-specific: sorted array
                                          # traversal, numerically ascending
}
{
  delete a
  for(i = 2; i < NF; ++i) { a[$i] }       # remember the fields in a line.
                                          # duplicates are removed here.
                                          # note that it's < NF instead of
                                          # <= NF because the trailing comma
                                          # leaves us with an empty last
                                          # field.

  line = $1 " "                           # start building line: IP field
  for(i in a) {                           # append numbers separated by
    line = line i ","                     # commas
  }
  sub(/,$/, "", line)                     # remove last trailing comma
  print line                              # print result.
}
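
The per-line approach can also be sketched without the GNU-specific sorted traversal, assuming plain POSIX awk: a seen array skips duplicates, and a small insertion sort keeps the numbers in ascending order. The function name dedupe_line_numbers and the array names seen and vals are hypothetical, chosen only for this example.

```shell
#!/bin/sh
# Hypothetical helper: sorts and deduplicates the numbers on each line of $1.
dedupe_line_numbers() {
  awk -F '[ ,]' '{
    n = 0
    split("", seen)                    # portable way to empty an array
    for (i = 2; i <= NF; ++i) {
      if ($i == "" || ($i in seen))    # skip empty fields and duplicates
        continue
      seen[$i] = 1
      v = $i + 0                       # force a numeric value
      j = n                            # insertion sort: shift larger values
      while (j > 0 && vals[j] > v) {   # one slot to the right ...
        vals[j + 1] = vals[j]
        --j
      }
      vals[j + 1] = v                  # ... and drop v into the gap
      ++n
    }
    line = $1 " "
    for (k = 1; k <= n; ++k)
      line = line vals[k] (k < n ? "," : "")
    print line
  }' "$1"
}

# Intermediate data from the question (with trailing commas).
sample=$(mktemp)
printf '%s\n' '1.2.3.4: 1,3,4,4,5,6,' '1.2.3.5: 9,8,7,6,' \
              '1.2.3.6: 1,1,1,' > "$sample"
dedupe_line_numbers "$sample"
rm -f "$sample"
```

Unlike the gawk version, this keeps the lines in input order and sorts only the numbers within each line.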
