Cody Patterson

Reputation: 138

Sorting Lines Within Blocks in Awk

I have a very long file that contains a list of dependencies, their versions, and the service that each dependency belongs to. The file is grouped into blocks, one per dependency, and each block ends with a line containing only @.

Here is a snippet from the file I'm referring to:

foo.bar.baz:json:jar:2.2.2:compile service: ServiceTwo
foo.bar.baz:json:jar:2.2.10:compile service: ServiceThree
@
asm:asm:jar:3.3.1:compile service: ServiceOne
asm:asm:jar:3.3.1:compile service: ServiceTwo
asm:asm:jar:3.3.0:compile service: ServiceThree
@
hi.bye:beatles:jar:1.6:compile service: ServiceOne
hi.bye:beatles:jar:1.5:compile service: ServiceTwo
hi.bye:beatles:jar:1.15:compile service: ServiceThree
@

Notice that the version numbers are only roughly sorted from highest to lowest within each block of dependencies. I'm trying to write an awk script that sorts the lines within each block from highest version number to lowest. Here is what the output should look like:

foo.bar.baz:json:jar:2.2.10:compile service: ServiceThree
foo.bar.baz:json:jar:2.2.2:compile service: ServiceTwo
@
asm:asm:jar:3.3.1:compile service: ServiceOne
asm:asm:jar:3.3.1:compile service: ServiceTwo
asm:asm:jar:3.3.0:compile service: ServiceThree
@
hi.bye:beatles:jar:1.15:compile service: ServiceThree
hi.bye:beatles:jar:1.6:compile service: ServiceOne
hi.bye:beatles:jar:1.5:compile service: ServiceTwo
@

Note: The service names in the output do not need to be in any particular order, as long as the versions are sorted from highest to lowest.

Logically I'm thinking that I should set RS="@" and create an array that will contain each line within that block, then sort those arrays by the version number and print them. The issue is, I don't know how to sort them by their version numbers. Here is what I have so far in my awk script:

BEGIN {
    RS = "@";
}
{
    split($0, lines, "\n");

    # sort the array by the version number from highest to lowest
    # <--- I need help here

    for(key in lines) { print lines[key]; }
    delete lines;
}
END {
}

If this is completely off base I'm open to trying new approaches. Any help with this issue would be greatly appreciated!

Upvotes: 3

Views: 303

Answers (3)

James Brown

Reputation: 37424

Using GNU awk:

$ awk '
BEGIN {
    FS=":"
    PROCINFO["sorted_in"]="@ind_num_desc"  # traverse arrays by numeric index, descending
}
$0=="@" {                                  # at the end of a block
    for(i in a)                            # walk each version component in order
        for(j in a[i])
            for(k in a[i][j])
                for(l in a[i][j][k])
                    print a[i][j][k][l]    # output
    print "@"                              # block separator
    delete a                               # reset for the next block
    next                                   # skip to the next record
}
{
    split($4,b,".")                        # split version into components
    a[b[1]][b[2]][b[3]][--c]=$0            # store line; --c keeps input order for ties
}' file
foo.bar.baz:json:jar:2.2.10:compile service: ServiceThree
foo.bar.baz:json:jar:2.2.2:compile service: ServiceTwo
@
asm:asm:jar:3.3.1:compile service: ServiceOne
asm:asm:jar:3.3.1:compile service: ServiceTwo
asm:asm:jar:3.3.0:compile service: ServiceThree
@
hi.bye:beatles:jar:1.15:compile service: ServiceThree
hi.bye:beatles:jar:1.6:compile service: ServiceOne
hi.bye:beatles:jar:1.5:compile service: ServiceTwo
@

What was supposed to be a quick and beautiful walk in the park turned out to be a nasty hack.
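
If arrays of arrays are not available (they are a GNU awk feature), roughly the same result can be sketched in POSIX awk by turning each version into a zero-padded key and insertion-sorting each block. This is only a sketch: the names sort_blocks, verkey and emit_block are made up for illustration, and it assumes versions have at most four purely numeric components.

```shell
# sort_blocks: read "@"-terminated blocks on stdin and print each block with
# its lines ordered from highest to lowest version (field 4 when split on ":").
sort_blocks() {
    awk -F: '
    # Build a fixed-width key so that plain string comparison orders versions
    # numerically, e.g. "2.2.10" sorts above "2.2.2".
    # Assumption: at most 4 numeric components per version.
    function verkey(v,    parts, n, i, key) {
        n = split(v, parts, ".")
        key = ""
        for (i = 1; i <= 4; i++)
            key = key sprintf("%010d", (i <= n) ? parts[i] : 0)
        return key
    }
    # Print the buffered block after a stable insertion sort, descending by key.
    function emit_block(    i, j, k, l) {
        for (i = 2; i <= n; i++) {
            k = keys[i]; l = lines[i]
            for (j = i - 1; j >= 1 && keys[j] < k; j--) {
                keys[j + 1] = keys[j]; lines[j + 1] = lines[j]
            }
            keys[j + 1] = k; lines[j + 1] = l
        }
        for (i = 1; i <= n; i++) print lines[i]
        n = 0
    }
    $0 == "@" { emit_block(); print; next }                    # end of block
    NF        { n++; keys[n] = verkey($4); lines[n] = $0 }     # skip blank lines
    END       { emit_block() }                                 # block with no trailing "@"
    '
}
```

It would be called as sort_blocks < file. Lines with equal versions keep their input order, because the inner loop only moves a line past entries with a strictly smaller key.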

Upvotes: 1

karakfa

Reputation: 67507

Here is another awk approach that pipes each block through sort:

$ awk '/^@/{close(cmd); print; next} 
           {cmd="sort -rV"; print | cmd}' file

foo.bar.baz:json:jar:2.2.10:compile service: ServiceThree
foo.bar.baz:json:jar:2.2.2:compile service: ServiceTwo
@
asm:asm:jar:3.3.1:compile service: ServiceTwo
asm:asm:jar:3.3.1:compile service: ServiceOne
asm:asm:jar:3.3.0:compile service: ServiceThree
@
hi.bye:beatles:jar:1.15:compile service: ServiceThree
hi.bye:beatles:jar:1.6:compile service: ServiceOne
hi.bye:beatles:jar:1.5:compile service: ServiceTwo
@
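
A possible variation on the same pipe-to-sort idea, sketched under the assumption that GNU sort is available (for -V): restrict the comparison to the version field and keep ties in input order. The function name sort_blocks_with_sort is hypothetical, and the fflush() call is a precaution so that each "@" is written before the next sort process starts producing output.

```shell
# sort_blocks_with_sort: pipe each "@"-terminated block through GNU sort,
# comparing only field 4 (the version) in reverse version order.
# Assumption: GNU sort, which provides -V; -s keeps equal versions in input order.
sort_blocks_with_sort() {
    awk -v cmd="sort -s -t: -k4,4rV" '
    $0 == "@" { close(cmd); print; fflush(); next }   # let sort finish, then emit separator
    NF        { print | cmd }                         # feed non-blank lines to sort
    END       { close(cmd) }                          # flush a block with no trailing "@"
    '
}
```

It would be called as sort_blocks_with_sort < file. Closing the pipe at each "@" is what makes sort emit the current block before the separator is printed.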

Upvotes: 4

Ed Morton

Reputation: 204174

With GNU sort for version sort:

$ awk -F':' -v OFS='\t' 'NF==1{c++} {print c+1, $4, $0}' file  | sort -k1n -k2rV | cut -f3-
foo.bar.baz:json:jar:2.2.10:compile service: ServiceThree
foo.bar.baz:json:jar:2.2.2:compile service: ServiceTwo
@
asm:asm:jar:3.3.1:compile service: ServiceTwo
asm:asm:jar:3.3.1:compile service: ServiceOne
asm:asm:jar:3.3.0:compile service: ServiceThree
@
hi.bye:beatles:jar:1.15:compile service: ServiceThree
hi.bye:beatles:jar:1.6:compile service: ServiceOne
hi.bye:beatles:jar:1.5:compile service: ServiceTwo
@

Upvotes: 4
