justin silverbaum

Reputation: 1

Sort file by field in first line of block of lines using awk

Preferably awk, but other scripting languages are fine. (The "{{{1" markers at line ends are for Vim folding and should be irrelevant to the desired output.) The data in the example comes from online chatrooms, which I pasted into a text file.

Begin file contents:
-----------------------------------------------
/mysql    unauth'd user on show processlist, plus db conn error in a site {{{1

1:05 
Gary
can you ck belljar20 instance?

1:06
Justin looks like reverse dns issue

-----------------------------------------------
/mysql    pingtimes to db server solved by adding domain to /etc/hosts on db server {{{1
per internal wiki
...

-----------------------------------------------
/php54    back to php52 with manual fix for https {{{1
Gary 
can u force mkp44aaa.net to bind to an ip address?
...

-----------------------------------------------
:End file contents

The records, aka blocks (of varying numbers of lines), start with a one-word "/category" at the beginning of the first line, marked by a leading forward slash "/", and end with a line of about 40 dashes. The three example blocks above include two with category "/mysql" and one with category "/php54".

In the example above, I would like the output sorted so that the two "/mysql" category blocks are next to each other.

So, essentially, just sort the blocks by the category name.

I've seen lots of components of a solution but just cannot seem to find one that is on point enough for me to adapt it.

Upvotes: 0

Views: 408

Answers (1)

Tiago Lopo

Reputation: 7959

If you can use perl:

#! /bin/bash

input=/tmp/file

perl -n0le '
    while (s/(\/\w+(.|\n)*?-+)//m){
        $str=$1; $str=~/(\/\w+)/;
        $h{$1}=[] unless exists $h{$1};
        push @{$h{$1}},$str;
    }
    END{
        foreach $key (sort keys %h){
            foreach ( @{$h{$key}} ){
                print $_."\n";
            }
        }
    }' $input

Explanation:

There are a lot of things going on here. First of all, we want a multi-line match; that's why we're using -0, which sets the input record separator to the null character, so a file containing no null bytes is read into $_ in one piece.

Then we extract each block with the pattern "(\/\w+(.|\n)*?-+)" and build a hash of arrays keyed by "/category". At the end we sort by that key and print.
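For comparison, here is a minimal sketch of the same group-by-category idea in Python. It is not a translation of the Perl above: it reads line by line instead of using one multi-line regex. The function name sort_blocks and the rule "a line of 10 or more dashes closes a block" are assumptions made for illustration (the question only says "about 40 dashes"). Lines after the last separator are dropped, and a block whose first line does not start with "/word" is filed under an empty key, so it sorts first.

```python
import re
from collections import defaultdict

# Assumption: a block ends at a line made only of dashes (10 or more).
SEPARATOR = re.compile(r"^-{10,}\s*$")
# "/category" must be the first word of the block's first line.
CATEGORY = re.compile(r"^(/\w+)")


def sort_blocks(text):
    """Group blocks by category and return them with categories sorted."""
    groups = defaultdict(list)  # category -> blocks, kept in input order
    block = []
    for line in text.splitlines():
        if SEPARATOR.match(line):
            if block:
                match = CATEGORY.match(block[0])
                key = match.group(1) if match else ""
                # Keep the closing dashed line as part of the block.
                groups[key].append("\n".join(block + [line]))
            block = []
        elif block or line.strip():
            # Skip blank lines before a block starts, keep them inside it.
            block.append(line)
    return "\n".join(b for key in sorted(groups) for b in groups[key])
```

Calling print(sort_blocks(open("/tmp/file").read())) would print the regrouped file. Blocks that share a category keep their original relative order, because each list is filled in input order and only the keys are sorted.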

Output:

bash test.sh 
/aaa
this is a test

-----------------------------------------------
/mysql    unauth'd user on show processlist, plus db conn error in a site {{{1

1:05 
Gary
can you ck belljar20 instance?

1:06
Justin looks like reverse dns issue

-----------------------------------------------
/mysql    pingtimes to db server solved by adding domain to /etc/hosts on db server {{{1
per internal wiki
...

-----------------------------------------------
/php54    back to php52 with manual fix for https {{{1
Gary 
can u force mkp44aaa.net to bind to an ip address?
...

-----------------------------------------------

Upvotes: 1
