Reputation: 8093
I have a file like this:
$> cat testfile.txt
abc_xyz_2a foo
dft_pqr_abc_5c bar
pqr_ijk_1a alpha
efg_5b beta
ijk_pqr_5a gamma
pqr_ijk_1b alpha
I want to sort the rows based on the last part of the first column, i.e. the text after the last underscore (_), like 1a, 2a, 5a, 5b, 5c.
So this is my expected output.
pqr_ijk_1a alpha
pqr_ijk_1b alpha
abc_xyz_2a foo
ijk_pqr_5a gamma
efg_5b beta
dft_pqr_abc_5c bar
Could someone please suggest a way to achieve the expected output?
What I tried
I tried extracting the part after the last underscore of the first column and sorting it, but that prints only those keys, not the whole lines.
$> awk '{print $1}' testfile.txt|rev|awk -F_ '{print $1}'|rev|sort
1a
2a
5a
5b
5c
I guess there could be a way to hold/note the line numbers somehow and output based on that? I tried some trial and error using NR in awk, without success.
Edit: Added a row ending with 1b to the file to handle another case, and changed the expected output accordingly.
Upvotes: 4
Views: 175
Reputation: 83
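You can try the command below, which is simpler and more straightforward. Note that it sorts on the reversed key, so the trailing letter is compared before the digit; that gives the expected order only while no key with a later letter has a smaller digit (a key such as 1b would sort after 5a).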
rev test.txt | sort -k2 | rev
pqr_ijk_1a alpha
abc_xyz_2a foo
ijk_pqr_5a gamma
efg_5b beta
dft_pqr_abc_5c bar
Upvotes: 0
Reputation: 8769
Just pull the characters you need to sort on out into leading columns, sort on those columns, and then strip them off again.
$ cat data
abc_xyz_2a foo
dft_pqr_abc_5c bar
pqr_ijk_1a alpha
efg_5b beta
ijk_pqr_5a gamma
$ awk '{print substr($1, length($1)-1, 1), substr($1, length($1)), $1, $2}' data | sort -n -k1,2 | awk '{print $3,$4}'
pqr_ijk_1a alpha
abc_xyz_2a foo
ijk_pqr_5a gamma
efg_5b beta
dft_pqr_abc_5c bar
Here is what happens at each step of the pipeline:
$ awk '{print substr($1, length($1)-1, 1), substr($1, length($1)), $1, $2}' data
2 a abc_xyz_2a foo
5 c dft_pqr_abc_5c bar
1 a pqr_ijk_1a alpha
5 b efg_5b beta
5 a ijk_pqr_5a gamma
$ awk '{print substr($1, length($1)-1, 1), substr($1, length($1)), $1, $2}' data | sort -n -k1,2
1 a pqr_ijk_1a alpha
2 a abc_xyz_2a foo
5 a ijk_pqr_5a gamma
5 b efg_5b beta
5 c dft_pqr_abc_5c bar
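The substr calls above assume the key is exactly one digit followed by one letter. As a rough sketch of the same extract, sort, strip idea for keys of other lengths, the key can be split into its leading digits and the remaining suffix, so that the number is compared numerically. The function name sort_by_suffix and the sample row with key 10a are made up for illustration and are not part of the original data.

```shell
#!/bin/sh
# sort_by_suffix [FILE...]: hypothetical helper, reads stdin when no file is given.
# Assumes the key after the last underscore looks like <digits><letters>.
sort_by_suffix() {
    awk '{
        n = split($1, parts, "_")               # parts[n] is the text after the last underscore
        key = parts[n]
        num = key; sub(/[^0-9].*$/, "", num)    # keep only the leading digits
        suf = key; sub(/^[0-9]*/, "", suf)      # keep only what follows the digits
        print num "\t" suf "\t" $0              # prepend both parts as tab-separated columns
    }' "$@" |
    sort -t "$(printf '\t')" -k1,1n -k2,2 |     # number first (numeric), then suffix (string)
    cut -f3-                                    # drop the two helper columns
}

# Example run on a few rows, including the made-up 10a row
printf '%s\n' 'abc_xyz_2a foo' 'xyz_10a delta' 'pqr_ijk_1b alpha' \
    'efg_5b beta' 'pqr_ijk_1a alpha' | sort_by_suffix
```

Because the number is sorted with -k1,1n, a key like 10a lands after 5b instead of before 2a, which a plain string sort would not give you.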
Upvotes: 2
Reputation: 785058
If you have gnu-awk then you can use the PROCINFO way of sorting an array:
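awk 'BEGIN{PROCINFO["sorted_in"] = "@ind_num_asc"} {
   # a[n] is the part of the first column after the last underscore
   n=split($1, a, "_")
   # index each line by that key; a repeated key overwrites the earlier line
   data[a[n]]=$0
}
END {
   # gawk walks the array in the order selected by PROCINFO["sorted_in"]
   for (i in data)
      print data[i]
}' file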
pqr_ijk_1a alpha
abc_xyz_2a foo
ijk_pqr_5a gamma
efg_5b beta
dft_pqr_abc_5c bar
Otherwise you can use an awk-sort-cut pipeline:
awk '{n=split($1, a, "_"); print $0 "\0" a[n]}' file | sort -t '\0' -k2 | cut -d $'\0' -f1
pqr_ijk_1a alpha
abc_xyz_2a foo
ijk_pqr_5a gamma
efg_5b beta
dft_pqr_abc_5c bar
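If NUL bytes are a problem with your awk, sort or cut, the same pipeline can be sketched with a tab as the delimiter and the key placed in front of the line. This is only a sketch: the function name sort_by_last_token is made up, and it assumes the key itself contains no tab. The key is compared as a string, so a multi-digit key like 10a would sort before 2a.

```shell
#!/bin/sh
# sort_by_last_token [FILE...]: hypothetical helper, reads stdin when no file is given.
sort_by_last_token() {
    awk '{
        n = split($1, a, "_")        # a[n] is the text after the last underscore
        print a[n] "\t" $0           # key, a tab, then the untouched line
    }' "$@" |
    LC_ALL=C sort -t "$(printf '\t')" -k1,1 |   # sort on the key column only
    cut -f2-                                    # remove the key column again
}

# Example run on the rows from the question
printf '%s\n' 'abc_xyz_2a foo' 'dft_pqr_abc_5c bar' 'pqr_ijk_1a alpha' \
    'efg_5b beta' 'ijk_pqr_5a gamma' 'pqr_ijk_1b alpha' | sort_by_last_token
```

Since every line is kept whole behind its key, rows that share a key are all printed; nothing is overwritten.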
Upvotes: 2