Amarjit Singh

Reputation: 51

Remove duplicate data from text file based on specific repeating criteria

I have a text file from which I want to remove some lines. The example contents of the file are below:

v1 has output 1.1
v2 has output 10.2
v3 has output 5.4
v4 has output 1.1
v5 has output 10.2
v6 has output 12
------------------
and so on

As seen above, the values 1.1 and 10.2 repeat several times. I want to preserve the first 10 lines for 1.1, 10.2, and every other value like them (the values differ and there are hundreds of distinct numbers), but delete all subsequent duplicates, even though the v parameter is different on every line. I also want to preserve non-repeating data.

I tried sort with uniq, but it only eliminates lines that are exact duplicates of each other, not lines matching a specific condition:

sort file.txt | uniq -i
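For illustration (assuming GNU or POSIX uniq), even comparing on the value field alone does not express this:

sort -k4,4 file.txt | uniq -f3

Here -f3 tells uniq to skip the first three fields when comparing, so lines are deduplicated by the output value only; but that keeps just one line per value, loses the original order, and uniq has no option for keeping the first 10 occurrences of each.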

Upvotes: 0

Views: 63

Answers (2)

Jotne

Reputation: 41456

Here is an awk solution:

awk 'a[$4==1.1 || $4==10.2]++<10 {print;next} !($4==1.1 || $4==10.2)' file
v1 has output 1.1
v2 has output 10.2
v3 has output 5.4
v4 has output 1.1
v5 has output 10.2
v6 has output 12

It prints the first 10 lines that contain 1.1 or 10.2 (counted together, since the array index a[$4==1.1 || $4==10.2] is just the 0/1 result of the test) and all other lines unconditionally.
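If the goal is the first 10 of each value separately rather than 10 combined, a minimal variant (an untested sketch that keys the counter on the value itself) would be:

awk '($4==1.1 || $4==10.2) ? (a[$4]++ < 10) : 1' file

For matching lines the pattern is true only for the first 10 occurrences of that value; for all other lines it is the constant 1, so they are always printed.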

Upvotes: 1

Ed Morton

Reputation: 203577

Sounds like all you need is:

awk '++cnt[$NF]<11' file

e.g. (using a limit of 3 instead of 11 here so the cut-off is visible with this short sample):

$ cat file
v1 has output 1.1
v2 has output 10.2
v3 has output 5.4
v4 has output 1.1
v5 has output 10.2
v6 has output 12
v7 has output 1.1
v8 has output 10.2
v9 has output 5.4
v10 has output 1.1
v11 has output 10.2
v12 has output 12

$ awk '++cnt[$NF]<3' file
v1 has output 1.1
v2 has output 10.2
v3 has output 5.4
v4 has output 1.1
v5 has output 10.2
v6 has output 12
v9 has output 5.4
v12 has output 12
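Here $NF is the last field of each line (the output value), so ++cnt[$NF]<11 keeps the first 10 lines for every distinct value. If a configurable limit is wanted, a sketch with the threshold passed as a variable (the name max is just illustrative):

$ awk -v max=10 '++cnt[$NF] <= max' file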

Upvotes: 1
