Reputation: 125

bash sort by last occurrence of key

I want to sort the lines of a file by the number that follows the @ at the end of each line. The problem is that @ can appear more than once per line. The file could look something like this:

'Hello from line 2' @2
'Hello from line 3' @3
'Hi' @5 'Hello from line 1' @1

I want my output ordered like this:

'Hi' @5 'Hello from line 1' @1
'Hello from line 2' @2
'Hello from line 3' @3

But the @5 gets in the way. I have tried:

cat myFile.txt | sort -t@ -k2

But the @ I need is not in a fixed column; it is the last one on the line. I have seen some solutions on this site using awk, but they seem to fail in my case.

Any help is welcome.

Upvotes: 0

Answers (5)

Logan Lee

Reputation: 987

Here's my solution

$ cat at.txt | sed -E 's/(@[0-9])$/D\1/' | sort -tD -k2,2 | tr -d 'D'
'Hi' @5 'Hello from line 1' @1
'Hello from line 2' @2
'Hello from line 3' @3

Here are the steps I took:

First we add 'D' before the last @ for each line.

'Hello from line 2' D@2
'Hello from line 3' D@3
'Hi' @5 'Hello from line 1' D@1

Then we sort by the second column with delimiter 'D'.

'Hi' @5 'Hello from line 1' D@1
'Hello from line 2' D@2
'Hello from line 3' D@3

Lastly, we remove 'D'.

'Hi' @5 'Hello from line 1' @1
'Hello from line 2' @2
'Hello from line 3' @3

Hope you found this useful.

Upvotes: 0

James Brown

Reputation: 37404

A GNU awk solution that hashes the records into a two-dimensional array a and uses PROCINFO["sorted_in"] to control the traversal order. First, a slightly modified sample:

b@1
a@3
1@3
a@2
1@4
b@2
a@1
a@4

Then the program:

$ gawk 'BEGIN {
    FS="@"                                        # field separator
}
{
    a[$NF][++c[$NF]]=$0                           # hash records: 1st dim is the
}                                                 # number, 2nd a serial counter
END {                                             # for duplicates of that number
    PROCINFO["sorted_in"]="@ind_num_asc"          # 1st dim, sort on index value
    for(i in a) {
        PROCINFO["sorted_in"]="@val_str_asc"      # 2nd dim, sort on array value
        for(j in a[i])
            print a[i][j]
        # PROCINFO["sorted_in"]="@ind_num_asc"    # not sure if needed, seems like not
    }
}' file

Output:

a@1
b@1
a@2
b@2
1@3
a@3
1@4
a@4

... or with your data:

'Hi' @5 'Hello from line 1' @1
'Hello from line 2' @2
'Hello from line 3' @3

Upvotes: 1

RavinderSingh13

Reputation: 133458

Could you please try the following combination of rev + sort (written and tested with the shown samples; after seeing Cyrus's comment, the numbers are assumed to be single digits).

rev Input_file | sort -n | rev

Logical explanation:

First, rev prints every line of Input_file reversed (from last character to first character).
The last digit is now the first character of each line, so the output is passed to sort -n to sort it numerically.
Once it is sorted, rev is used again to restore each line to its original form.
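
As a minimal sketch, the same pipeline can be wrapped in a function and tried on the sample from the question. The name sort_by_last_digit is hypothetical, not part of the original answer, and the single-digit assumption still applies: rev also reverses the digits of a multi-digit number, so @12 would be compared as 21.

```shell
# Hypothetical wrapper around the rev | sort -n | rev pipeline.
# Assumes every line ends in a single-digit number.
sort_by_last_digit() {
    rev "$@" |      # reverse each line so the trailing digit comes first
        sort -n |   # numeric sort on the leading digit
        rev         # reverse each line back to its original form
}

printf '%s\n' \
    "'Hello from line 2' @2" \
    "'Hello from line 3' @3" \
    "'Hi' @5 'Hello from line 1' @1" | sort_by_last_digit
```

With no arguments the function reads standard input; with file names it passes them to rev.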

Upvotes: 3

Shawn

Reputation: 52344

$ sed 's/@\([^@]*\)$/'$'\37''\1/' input.txt | sort -t $'\37' -k2,2n | tr $'\37' '@'
'Hi' @5 'Hello from line 1' @1
'Hello from line 2' @2
'Hello from line 3' @3

This first replaces the last @ in each line with the ASCII unit separator character (US, which is very unlikely to appear elsewhere in your input), sorts numerically by the second column using US as the field delimiter, and then finally turns the US back into a @.

Upvotes: 0

Cyrus

Reputation: 88583

Schwartzian transform with awk and cut:

awk -F '@' '{print $NF,$0}' file | sort -n | cut -d " " -f 2-

$NF contains the last field, i.e. the text after the last @.

Output:

'Hi' @5 'Hello from line 1' @1
'Hello from line 2' @2
'Hello from line 3' @3
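
A possible variation on the same decorate-sort-undecorate idea, in case the numbers can have more than one digit or the lines can start with digits of their own. This is a sketch under assumptions: the function name sort_by_last_at is hypothetical, the text after the last @ is assumed to be a number, and the -s (stable) option is assumed to be available, as it is in GNU and BSD sort.

```shell
# Hypothetical helper: sort lines by the number after the last '@'.
# Assumes the text after the last '@' is numeric and that sort supports -s.
sort_by_last_at() {
    awk -F '@' '{ print $NF "\t" $0 }' "$@" |   # decorate: key, tab, original line
        sort -t "$(printf '\t')" -k1,1n -s |    # numeric sort on the key field only
        cut -f 2-                               # undecorate: drop the key field
}

printf '%s\n' "a @10" "b @9" "c @5 d @2" | sort_by_last_at
```

Using a tab as the delimiter and restricting the key to -k1,1n keeps the rest of the line out of the comparison, and -s leaves lines with equal keys in their input order.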

Upvotes: 4
