Reputation: 91

Sorting lines and removing all but one line based on the last string?

Searching for a solution to my problem, I found this thread: Sorting on the last field of a line

I tried some of the sed and awk solutions for sorting from that thread, and they work. One more thing I need is to delete all but one line for each value of the last string on a line.

Example, I have:

www.site.com/324242_1234
www.site.com/233_1234
www.site.com/45357_1234
www.site.com/6545_2345
www.site.com/5433_2345
www.site.com/87745_456
www.site.com/453209_456
www.site.com/1345_456

Need this result:

www.site.com/324242_1234
www.site.com/6545_2345
www.site.com/87745_456

So I need to keep only one line for each value of that last string; in this example the last string is separated from the rest of the line by an underscore. I would appreciate any help.

Upvotes: 1

Answers (3)

karakfa

Reputation: 67537

$ sort -t_ -u -k2 file

www.site.com/324242_1234
www.site.com/6545_2345
www.site.com/87745_456

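This assumes there are no underscores earlier in the line: with -t_ and -k2 the sort key runs from the second underscore-separated field to the end of the line.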
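To illustrate that assumption, here is a small sketch with made-up input (the two sample URLs are hypothetical, not from the question). When a line has an extra underscore, the key starting at field 2 differs even though the last string is the same, so both lines survive:

```shell
# Hypothetical input: the first line has an extra underscore before the key.
sample='www.site.com/a_b_1234
www.site.com/c_1234'

# With -t_ -k2 the keys are "b_1234" and "1234", so -u treats them as distinct.
# LC_ALL=C is set here only to make the ordering predictable.
sort_result=$(printf '%s\n' "$sample" | LC_ALL=C sort -t_ -u -k2)

printf '%s\n' "$sort_result"
```

Both lines are printed, although they share the last string 1234.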

An awk solution can be:

$ awk -F_ '!a[$NF]++' file

www.site.com/324242_1234
www.site.com/6545_2345
www.site.com/87745_456

Explanation: after setting the field delimiter to _, $NF refers to the last field. a[$NF]++ counts the occurrences of each value, starting from zero. !a[$NF]++ negates the count as it was before the increment, so the expression is true only when the count is zero, which is the first time that key value is seen; awk prints the line whenever the expression is true. This site has many examples of this awk idiom.
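For readers new to the idiom, it may help to see it spelled out. The following is a sketch of what I believe is an equivalent longhand form, run against the sample data from the question; the names seen and key are only illustrative:

```shell
# Sample input copied from the question.
input='www.site.com/324242_1234
www.site.com/233_1234
www.site.com/45357_1234
www.site.com/6545_2345
www.site.com/5433_2345
www.site.com/87745_456
www.site.com/453209_456
www.site.com/1345_456'

# Longhand form of: awk -F_ '!a[$NF]++'
# "seen" and "key" are illustrative names, not part of the original answer.
result=$(printf '%s\n' "$input" | awk -F_ '{
    key = $NF                 # last underscore-separated field
    if (seen[key] == 0)       # an unseen key compares equal to 0
        print                 # first line for this key: keep it
    seen[key]++               # count this occurrence
}')

printf '%s\n' "$result"
```

Unlike the sort version, this keeps the lines in their original input order and always keeps the first line seen for each key.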

Upvotes: 2

P....

Reputation: 18411

awk -F'[_/]' '{print $NF,$(NF-1),$0}' input_file | sort -r -nk2,1 | awk '!a[$1]++{gsub($1FS$2,"");gsub(/^ /,"");print}'
www.site.com/87745_456
www.site.com/6545_2345
www.site.com/45357_1234

Upvotes: 1

Kunal B.

Reputation: 563

How about this?

sed 's/_/\t/g' file | sort -u -k2,2 | sed 's/\t/_/g'

Here file contains the input lines. This relies on GNU sed understanding \t as a tab.

Upvotes: 1
