user9013730

Find duplicate lines in a file and report their count and line numbers, without sorting?

I've been reading a similar question here, but the answer there does not include line numbers.

[root@test ~]# cat -n file 
     1  123 
     2  123 
     3  234 
     4  234 
     5  123 
     6  345
[root@test ~]#

[root@test ~]# sort file | uniq -c
      3 123 
      2 234 
      1 345
[root@test ~]# 

What I'm looking for is something like this, preferably as a Linux shell script, though any other scripting solution is fine.

Output provided by textmechanic.com

( 2 dupe of 1 ): 123 
( 4 dupe of 3 ): 234 
( 5 dupe of 1 ): 123 

Upvotes: 2

Views: 77

Answers (1)

anubhava

Reputation: 785856

You may use awk:

awk '{if ($1 in a) printf "( %d dupe of %d ): %s\n", NR, a[$1], $1; else a[$1] = NR}' file

( 2 dupe of 1 ): 123
( 4 dupe of 3 ): 234
( 5 dupe of 1 ): 123
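
The command above stores, in the array a, the line number at which each first field ($1) was first seen. When a later line has the same first field, it prints the current line number (NR) together with the stored one. Since the question also asks for a count, here is a minimal sketch of one way to add a running count. The function name dupes_with_count is hypothetical, and keying on the whole line with trailing whitespace removed (instead of $1) is an assumption about what should count as a duplicate:

```shell
# Hypothetical helper, not part of the original answer.
# For every repeated line, print its line number, the line number of the
# first occurrence, and how many times the line has been seen so far.
dupes_with_count() {
    awk '
        {
            key = $0
            sub(/[ \t]+$/, "", key)   # assumption: ignore trailing whitespace
            count[key]++              # running count for this line
            if (key in first) {
                printf "( %d dupe of %d ): %s (seen %d times)\n", NR, first[key], key, count[key]
            } else {
                first[key] = NR       # remember where the line first appeared
            }
        }
    ' "$@"
}
```

Call it as dupes_with_count file. The file is read once in its original order, so no sorting is involved, and the last line printed for a given value carries its total count.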

Upvotes: 2
