Ali emaditaj

Reputation: 25

How to delete lines with duplicate numbers

I want to delete any lines that have the same number at the end, for example:

Input:

abc 77777
rgtds 77777
aswa 77777
gdf 845
sdf 845
ytn 963
fgnb 963

Output:

abc 77777
gdf 845
ytn 963

Note: out of all the lines that share the same number, exactly one must stay and the rest must be deleted.

Specifically, I want to convert this text file into the output below:

Input:

 c:/files/company/aj/psohz.mp4 905
 c:/files/company/rs/oxija.mp4 905
 c:/files/company/nw/kzlkg.mp4 905
 c:/files/company/wn/wpqov.mp4 905
 c:/files/company/qi/jzdjg.mp4 905
 c:/files/company/kq/dadfr..mp4 905
 c:/files/company/kp/xmpye.jpg 7839
 c:/files/company/fx/jszmn.jpg 7839
 c:/files/company/me/plsqx.mp4 7839
 c:/files/company/xm/uswjb.mp4 7839
 c:/files/company/ay/pnnhu.pdf 8636184
 c:/files/company/os/glwou.pdf 8636184
 c:/files/company/px/kucdu.pdf 8636184

Output:

 c:/files/company/kq/dadfr..mp4 905
 c:/files/company/kp/xmpye.jpg 7839
 c:/files/company/ay/pnnhu.pdf 8636184

Upvotes: 0

Views: 38

Answers (2)

choroba

Reputation: 241918

If the same numbers are always grouped together, you can use uniq (tested with the version from GNU coreutils):

uniq -f1 input.txt

-f1 means skip the first field when checking for duplicates.

Note that it returns the first element of each group, i.e. psohz instead of dadfr in your example. It's not clear which element of each group you wanted, as your expected output keeps the last line of the first group but the first line of the other groups.
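For instance, run against the second sample input above, it should print the first line of each group (assuming GNU uniq and exactly two fields per line):

$ uniq -f1 input.txt
 c:/files/company/aj/psohz.mp4 905
 c:/files/company/kp/xmpye.jpg 7839
 c:/files/company/ay/pnnhu.pdf 8636184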

If the same numbers aren't grouped together, use sort to group them together:

sort -k2 -su input.txt
  • -s means stable, i.e. you'll always get the first element of each group, but the groups won't appear in the original order in the output (see the example below)
  • -u means unique
  • -k2 means use only field 2 in comparisons
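
Run on the same sample input, this should keep the first line of each group, with the groups ordered lexically by the number field, so 7839 sorts before 905 (assuming GNU sort's default lexical comparison; if you want numeric order instead, adding -n should do it):

$ sort -k2 -su input.txt
 c:/files/company/kp/xmpye.jpg 7839
 c:/files/company/ay/pnnhu.pdf 8636184
 c:/files/company/aj/psohz.mp4 905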

If you want the first element of each group, with the lines kept in the same order as in the input, you can use perl:

perl -ane 'print unless $seen{ $F[1] }++' -- input.txt
  • -n reads the input line by line
  • -a splits the input on whitespace into the @F array
  • the second field of each line ($F[1]) is saved as a key in the %seen hash. The first time a number is seen, its line is printed; any following occurrence isn't, as $seen{ $F[1] } is then greater than 0, i.e. true (see the sketch below)
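
Unlike uniq, this works even when lines with the same number aren't adjacent. A minimal sketch, using the question's first example data shuffled:

$ perl -ane 'print unless $seen{ $F[1] }++' <<'EOF'
abc 77777
gdf 845
rgtds 77777
sdf 845
EOF
abc 77777
gdf 845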

Upvotes: 3

Benjamin W.

Reputation: 52162

If you know that there are always just two columns (i.e., no blanks in the filename) and that the lines with the same number are always in the same block, you can use uniq:

$ uniq -f1 infile
 c:/files/company/aj/psohz.mp4 905
 c:/files/company/kp/xmpye.jpg 7839
 c:/files/company/ay/pnnhu.pdf 8636184

-f1 says to ignore the first field when asserting uniqueness.

If you don't know about blanks, and the same numbers might be anywhere in the file, you can use awk:

$ awk '!a[$NF]++' infile
 c:/files/company/aj/psohz.mp4 905
 c:/files/company/kp/xmpye.jpg 7839
 c:/files/company/ay/pnnhu.pdf 8636184

This counts the occurrences of the last field of each line, and if the count is zero before incrementing, the line gets printed. It's a compact way of expressing

awk '{ if (a[$NF] == 0) { print; a[$NF] += 1 } }' infile
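
Because the key is $NF, the last whitespace-separated field, this also copes with blanks in the filename, which would throw off uniq -f1. A small illustration with made-up paths containing a space:

$ awk '!a[$NF]++' <<'EOF'
 c:/files/my company/aj/psohz.mp4 905
 c:/files/my company/rs/oxija.mp4 905
EOF
 c:/files/my company/aj/psohz.mp4 905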

Upvotes: 1
