Reputation: 91
While searching for a solution to my problem I found this thread: Sorting on the last field of a line
I used some of the solutions there for sorting with sed and awk, and they work. One more thing I need is to delete all but one line for each distinct last string in a line.
For example, I have:
www.site.com/324242_1234
www.site.com/233_1234
www.site.com/45357_1234
www.site.com/6545_2345
www.site.com/5433_2345
www.site.com/87745_456
www.site.com/453209_456
www.site.com/1345_456
I need this result:
www.site.com/324242_1234
www.site.com/6545_2345
www.site.com/87745_456
So I need to keep only one line for each value of that last string; in this example it is the part after the underscore. I would appreciate any help.
Upvotes: 1
Views: 248
Reputation: 67537
$ sort -t_ -u -k2 file
www.site.com/324242_1234
www.site.com/6545_2345
www.site.com/87745_456
This assumes there are no underscores earlier in the line, because -k2 takes the key from the second field to the end of the line.
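If the lines can contain earlier underscores, one possible workaround is to reverse each line so that the last field becomes the first, deduplicate on that field alone, and reverse back. This is only a sketch: it assumes the `rev` utility is installed (it ships with util-linux and is not part of POSIX), and `dedupe_by_last_field` is a hypothetical name chosen for illustration. Which of the duplicate lines survives `-u` is not something to rely on across sort implementations.

```shell
# Hypothetical helper: keep one line per trailing "_key", even when
# the line contains underscores before the key.
# Assumes `rev` is available (util-linux; not POSIX).
dedupe_by_last_field() {
    # 1st rev: the last field becomes the first field of each line.
    # sort:    -k1,1 limits the key to that field; -u keeps one line per key.
    # 2nd rev: restores the original text of the surviving lines.
    rev "$1" | sort -t_ -u -k1,1 | rev
}
```

Called as `dedupe_by_last_field file`, it prints the result ordered by the reversed key rather than in input order.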
An awk solution could be:
$ awk -F_ '!a[$NF]++' file
www.site.com/324242_1234
www.site.com/6545_2345
www.site.com/87745_456
Explanation: after setting the field delimiter, $NF refers to the last field. a[$NF]++ returns how many times that value has been seen so far (starting at zero) and then increments the count. !a[$NF]++ negates that count, so it is true only when the count is zero, that is, for the first occurrence of each key value. This site has many examples of this awk idiom.
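For readers new to the idiom, the same logic can be spelled out step by step. This is a minimal sketch, not a replacement for the one-liner; `first_per_key` and the array name `seen` are hypothetical names used only for illustration.

```shell
# Hypothetical helper: print the first line seen for each last field.
first_per_key() {
    awk -F_ '
    {
        key = $NF                 # last underscore-separated field
        if (seen[key] == 0)       # an unset array element compares equal to 0
            print                 # first occurrence of this key: emit the line
        seen[key]++               # record that the key has now been seen
    }' "$1"
}
```

Unlike the sort-based approach, this keeps the lines in their original input order and always keeps the first line for each key.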
Upvotes: 2
Reputation: 18411
awk -F'[_/]' '{print $NF,$(NF-1),$0}' input_file |sort -r -nk2,1 |awk '!a[$1]++{gsub($1FS$2,"");gsub(/^ /,"");print}'
www.site.com/87745_456
www.site.com/6545_2345
www.site.com/45357_1234
Upvotes: 1
Reputation: 563
How about this?
sed 's/_/\t/g' file | sort -u -k2,2 | sed 's/\t/_/g'
where file contains the input lines. Note that \t in sed is a GNU extension.
Upvotes: 1