goosebump

Reputation: 96

Custom sorting in Unix

I want to sort the following data in a particular order. I tried sort in different ways but couldn't find a solution. Please help, I am a newbie in Unix. :(

Data:-

method1:entry:2013.09.18.19.18.30
method1:exit:2013.09.18.19.18.30
method2:entry:2013.09.18.19.18.30
method2:exit:2013.09.18.19.18.30
method3:entry:2013.09.18.19.18.30
method4:entry:2013.09.18.19.18.30
method4:exit:2013.09.18.19.18.30
method1:entry:2013.09.18.19.18.30
method1:exit:2013.09.18.19.18.30
method3:exit:2013.09.18.19.18.30
method3:entry:2013.09.18.19.18.30
method5:entry:2013.09.18.19.18.30
method5:exit:2013.09.18.19.18.30
method3:exit:2013.09.18.19.18.30

Desired output:-

method1:entry:2013.09.18.19.18.30
method1:exit:2013.09.18.19.18.30
method1:entry:2013.09.18.19.18.30
method1:exit:2013.09.18.19.18.30
method2:entry:2013.09.18.19.18.30
method2:exit:2013.09.18.19.18.30
method3:entry:2013.09.18.19.18.30
method3:exit:2013.09.18.19.18.30
method3:entry:2013.09.18.19.18.30
method3:exit:2013.09.18.19.18.30
method4:entry:2013.09.18.19.18.30
method4:exit:2013.09.18.19.18.30
method5:entry:2013.09.18.19.18.30
method5:exit:2013.09.18.19.18.30

The sorting should be based on method name and 'entry-exit' occurrence.

Upvotes: 2

Views: 1371

Answers (2)

Ashish Gaur

Reputation: 2050

Try this:

sed -e 's/:/ /g' file.txt | sort |
awk 'BEGIN { var_entry="entry"; var_exit="exit"; flag="entry" }
    { if (flag == $2 && var_entry ==$2 ){
        i = 0; flag=var_exit; }
      else if (flag == $2 && var_exit == $2 ){
        i = 0; flag=var_entry; };
      i++ ; print i, $0 }' |
sort -t" " -k 2,2 -k 1,1 | sed 's/^[0-9]* //g'

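The logic behind this is:

  1. sed -e 's/:/ /g' replaces every : with a space, so the fields are space-delimited for the following steps.

  2. sort sorts the whole lines, which groups them by method name (the first column) and puts each method's entry lines before its exit lines.

  3. The awk step prepends a counter column: within each method it numbers the entry lines 1, 2, ... and then the exit lines 1, 2, ..., so the n-th entry and the n-th exit of a method share the same number. Sorting on that column later gives the entry/exit pattern. The output is: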

    1 method1 entry 2013.09.18.19.18.30
    2 method1 entry 2013.09.18.19.18.30
    1 method1 exit 2013.09.18.19.18.30
    2 method1 exit 2013.09.18.19.18.30
    1 method2 entry 2013.09.18.19.18.30
    1 method2 exit 2013.09.18.19.18.30
    1 method3 entry 2013.09.18.19.18.30
    2 method3 entry 2013.09.18.19.18.30
    1 method3 exit 2013.09.18.19.18.30
    2 method3 exit 2013.09.18.19.18.30
    1 method4 entry 2013.09.18.19.18.30
    1 method4 exit 2013.09.18.19.18.30
    1 method5 entry 2013.09.18.19.18.30
    1 method5 exit 2013.09.18.19.18.30

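  4. sort -t" " -k 2,2 -k 1,1 sorts on the method column (now the 2nd column) and breaks ties on the newly added counter (the 1st column). An entry and an exit with the same counter still tie on both keys, so sort falls back to comparing the whole line, which puts entry before exit. The counter is compared as text here; use -k 1,1n if a method can occur ten or more times. The output is: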

    1 method1 entry 2013.09.18.19.18.30
    1 method1 exit 2013.09.18.19.18.30
    2 method1 entry 2013.09.18.19.18.30
    2 method1 exit 2013.09.18.19.18.30
    1 method2 entry 2013.09.18.19.18.30
    1 method2 exit 2013.09.18.19.18.30
    1 method3 entry 2013.09.18.19.18.30
    1 method3 exit 2013.09.18.19.18.30
    2 method3 entry 2013.09.18.19.18.30
    2 method3 exit 2013.09.18.19.18.30
    1 method4 entry 2013.09.18.19.18.30
    1 method4 exit 2013.09.18.19.18.30
    1 method5 entry 2013.09.18.19.18.30
    1 method5 exit 2013.09.18.19.18.30

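  5. sed 's/^[0-9]* //g' removes the helper column again. The fields are still space-separated at this point; append | tr ' ' ':' to the pipeline if you need the original colon delimiters back.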
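The same decorate, sort, undecorate idea can probably be written more compactly. The following is only a rough sketch, not a tested replacement for the pipeline above: the function name pair_sort is made up for illustration, and it assumes every line has the form method:event:timestamp with event being either entry or exit. It keeps the original colon-delimited lines intact by putting the helper fields in front of them.

```shell
#!/bin/sh
# pair_sort (hypothetical name): read "method:event:timestamp" lines on stdin
# and print them grouped by method, each entry followed by its exit.
pair_sort() {
    # Decorate: prefix each line with "<method>:<n>:<rank>:", where n counts
    # how often this method/event combination has been seen so far and
    # rank is 0 for entry and 1 for anything else (assumed to be exit).
    awk -F: '{
        n = ++seen[$1, $2]
        rank = ($2 == "entry") ? 0 : 1
        print $1 ":" n ":" rank ":" $0
    }' |
    # Sort by method name, then numerically by occurrence, then by rank.
    sort -t: -k1,1 -k2,2n -k3,3n |
    # Undecorate: drop the three helper fields, keeping the original line.
    cut -d: -f4-
}
```

It would be called as pair_sort < file.txt. Because the occurrence number is counted per method and event, it does not depend on the lines having been sorted beforehand.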

Upvotes: 1

chepner

Reputation: 532238

It appears you simply want to sort by method name, which is the first colon-delimited field.

sort -t: -s -k1,1 file.txt

The -s flag (stable sort) prevents sort from modifying the relative order of lines with the same first field.
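As far as I know, -s is supported by GNU and BSD sort but is not among the options POSIX requires. If it is unavailable, the same effect can be approximated by using the input line number as a secondary key. This is a sketch under that assumption; the function name stable_by_method is hypothetical.

```shell
#!/bin/sh
# stable_by_method (hypothetical name): sort "method:event:timestamp" lines
# from stdin by method name, keeping the input order within each method.
stable_by_method() {
    # Decorate: prefix each line with "<method>:<input line number>:".
    awk -F: '{ print $1 ":" NR ":" $0 }' |
    # Sort by method name, then numerically by original line number.
    sort -t: -k1,1 -k2,2n |
    # Undecorate: remove the two helper fields again.
    cut -d: -f3-
}
```

Usage would be stable_by_method < file.txt. Since every line number is unique, ties never reach sort's whole-line fallback comparison.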

Upvotes: 2
