OXXO

Reputation: 724

Sort a column in alternating ascending and descending order by fixed-size blocks of rows

I am trying to sort column 2 in blocks of 5 rows, alternating the sort order from block to block. E.g. first block (rows 1 to 5): sort column 2 in ascending order; second block (rows 6 to 10): sort column 2 in descending order.

The same operation should be applied to the whole file.

Input file

P 45683.00  39785.00 1 12 
P 45685.00  39785.00 1 12 
P 45687.00  39785.00 1 12 
P 45689.00  39785.00 1 12 
P 45691.00  39785.00 1 12 
P 45683.00  39795.00 1 12 
P 45685.00  39795.00 1 12 
P 45687.00  39795.00 1 12 
P 45689.00  39795.00 1 12 
P 45691.00  39795.00 1 12 
P 45683.00  39805.00 1 12 
P 45685.00  39805.00 1 12 
P 45687.00  39805.00 1 12 
P 45689.00  39805.00 1 12 
P 45691.00  39805.00 1 12 
P 45683.00  39815.00 1 12 
P 45685.00  39815.00 1 12 
P 45687.00  39815.00 1 12 
P 45689.00  39815.00 1 12 
P 45691.00  39815.00 1 12

Desired output

P 45683.00  39785.00 1 12 
P 45685.00  39785.00 1 12 
P 45687.00  39785.00 1 12 
P 45689.00  39785.00 1 12 
P 45691.00  39785.00 1 12 
P 45691.00  39795.00 1 12 
P 45689.00  39795.00 1 12 
P 45687.00  39795.00 1 12 
P 45685.00  39795.00 1 12 
P 45683.00  39795.00 1 12 
P 45683.00  39805.00 1 12 
P 45685.00  39805.00 1 12 
P 45687.00  39805.00 1 12 
P 45689.00  39805.00 1 12 
P 45691.00  39805.00 1 12 
P 45691.00  39815.00 1 12 
P 45689.00  39815.00 1 12 
P 45687.00  39815.00 1 12 
P 45685.00  39815.00 1 12 
P 45683.00  39815.00 1 12 

My attempt

awk '/45691.00/{"awk \\$0+0==\\$0 "file | getline x}
{print x"~"FNR"~"$0 | "sort -k2,2n "}'

Thanks in advance

Upvotes: 1

Views: 169

Answers (3)

Enlico

Reputation: 28480

Your sample input file has the following characteristics:

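  • Lines 1-5, 11-15, ... are already sorted in ascending order
  • Lines 6-10, 16-20, ... are also in ascending order, i.e. upside down relative to the desired output

If this is the case, then the following (totally ugly and non-reusable, ahahah) command should be enough, since it only has to reverse every second block of 5 lines: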

< file1 sed -E 'N;N;N;N;N;N;N;N;N;s/^(.*)\n(.*)\n(.*)\n(.*)\n(.*)\n(.*)$/\1\n\6\n\5\n\4\n\3\n\2/' > file2.out
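
The same idea (print odd-numbered blocks as they are, print even-numbered blocks in reverse) can also be sketched in POSIX awk, which may be easier to adapt to other block sizes. This is only an illustration that relies on the same assumption as above, namely that every block is already in ascending order. The function name flip_blocks and its optional block-size argument are made up for this example.

```shell
# Hypothetical helper: reads stdin and reverses every second block of n lines
# (n defaults to 5). It does NOT sort; it assumes each block is already ascending.
flip_blocks() {
    awk -v n="${1:-5}" '
        function emit(   i) {
            if (blk++ % 2)                      # 2nd, 4th, ... block: print reversed
                for (i = cnt - 1; i >= 0; i--) print buf[i]
            else                                # 1st, 3rd, ... block: print as is
                for (i = 0; i < cnt; i++) print buf[i]
            cnt = 0
        }
        { buf[cnt++] = $0 }                     # collect the current block
        cnt == n { emit() }                     # block is complete
        END { if (cnt > 0) emit() }             # trailing partial block, if any
    '
}
```

It could be used as `flip_blocks < file1 > file2.out`, or as `flip_blocks 10 < file1` for blocks of 10 lines.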

Upvotes: 1

James Brown

Reputation: 37424

Using GNU awk and asort():

$ gawk '
function process() {
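    # asort() reindexes a as 1..n; the sort order is toggled on every call
    n=asort(a,a,(o=="@ind_num_asc" ? o="@ind_num_desc" : o="@ind_num_asc"))
    for(i=1;i<=n;i++)    # explicit index loop: the order of "for(i in a)" is unspecified
        print a[i]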
    delete a
}
{
    a[$2]=a[$2] (a[$2]==""?"":ORS) $0
}
NR%5==0 {
    process()
}
END {
    process()
}' file

Output:

P 45683.00  39785.00 1 12 
P 45685.00  39785.00 1 12 
P 45687.00  39785.00 1 12 
P 45689.00  39785.00 1 12 
P 45691.00  39785.00 1 12 
P 45691.00  39795.00 1 12 
P 45689.00  39795.00 1 12 
P 45687.00  39795.00 1 12 
P 45685.00  39795.00 1 12 
P 45683.00  39795.00 1 12 
P 45683.00  39805.00 1 12 
P 45685.00  39805.00 1 12 
P 45687.00  39805.00 1 12 
P 45689.00  39805.00 1 12 
P 45691.00  39805.00 1 12 
P 45691.00  39815.00 1 12
P 45689.00  39815.00 1 12 
P 45687.00  39815.00 1 12 
P 45685.00  39815.00 1 12 
P 45683.00  39815.00 1 12 

Upvotes: 1

Ed Morton

Reputation: 204164

With GNU awk for sorted_in:

$ cat tst.awk
{ block[$2] = block[$2] $0 ORS }
!(NR % 5) { prt() }
END { prt() }

function prt(   i,j) {
    PROCINFO["sorted_in"] = "@ind_num_" ( (++inst) % 2 ? "asc" : "desc" )
    for (i in block) {
        printf "%s", block[i]
    }
    delete block
}


$ awk -f tst.awk file
P 45683.00  39785.00 1 12
P 45685.00  39785.00 1 12
P 45687.00  39785.00 1 12
P 45689.00  39785.00 1 12
P 45691.00  39785.00 1 12
P 45691.00  39795.00 1 12
P 45689.00  39795.00 1 12
P 45687.00  39795.00 1 12
P 45685.00  39795.00 1 12
P 45683.00  39795.00 1 12
P 45683.00  39805.00 1 12
P 45685.00  39805.00 1 12
P 45687.00  39805.00 1 12
P 45689.00  39805.00 1 12
P 45691.00  39805.00 1 12
P 45691.00  39815.00 1 12
P 45689.00  39815.00 1 12
P 45687.00  39815.00 1 12
P 45685.00  39815.00 1 12
P 45683.00  39815.00 1 12

If you actually want to print every time $3 changes instead of every 5 lines then just change:

{ block[$2] = block[$2] $0 ORS }
!(NR % 5) { prt() }

to:

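# NR > 1 skips the call on the very first line, when block is still empty,
# so that the first block is still printed in ascending order
$3 != prev { if (NR > 1) prt(); prev=$3 }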
{ block[$2] = block[$2] $0 ORS }

Upvotes: 3
