Reputation: 79
cat file1.txt
abc bcd abc ...
abcd bcde cdef ...
abcd bcde cdef ...
abcd bcde cdef ...
efg fgh ...
efg fgh ...
hig ...
My expected result is as follows:
abc bcd abc ...
abcd bcde cdef ...
<!!! pay attention, above sentence has repeated 3 times !!!>
efg fgh ...
<!!! pay attention, above sentence has repeated 2 times !!!>
hig ...
I have found a way to do this, but my code is rather noisy:
cat file1.txt | uniq -c | sed -e 's/ \+/ /g' -e 's/^.//g' | awk '{print $0," ",$1}'| sed -e 's/^[2-9] /\n/g' -e 's/^[1] //g' |sed -e 's/[^1]$/\n<!!! pay attention, above sentence has repeated & times !!!> \n/g' -e 's/[1]$//g'
abc bcd abc ...
abcd bcde cdef ...
<!!! pay attention, above sentence has repeated 3 times !!!>
efg fgh ...
<!!! pay attention, above sentence has repeated 2 times !!!>
hig ...
I was wondering if you could show me a more efficient way to achieve this. Thanks a lot.
Upvotes: 1
Views: 73
Reputation: 203899
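$ awk '
    $0==prev { cnt++; next }        # same line as before: extend the current run
    { prt(); prev=$0; cnt=1 }       # new line: flush the previous run, start a new one
    END { prt() }                   # flush the final run
    function prt() {
        # cnt is unset only before the first line, so a one-line file still prints
        if (cnt) print prev (cnt>1 ? ORS "repeated " cnt " times" : "") ORS
    }
' file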
abc bcd abc ...
abcd bcde cdef ...
repeated 3 times
efg fgh ...
repeated 2 times
hig ...
Upvotes: 2
Reputation: 247002
If your lines are not already grouped, then you could use:
awk '
NR == FNR {count[$0]++; next}
!seen[$0]++ {
print
if (count[$0] > 1)
print "... repeated", count[$0], "times"
}
' file1.txt file1.txt
This reads the file twice and keeps every distinct line in memory, so it will consume a lot of memory if your file is very large. You might want to sort it first.
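The sort-first suggestion could be sketched roughly as below. This is only an illustration, not part of the answer above: `count_repeats` and `report` are hypothetical names, and it assumes a POSIX `sort` and `awk`. Once the input is sorted, duplicates are adjacent, so awk only needs to remember the previous line and a counter instead of two arrays. Note that the output comes out in sorted order rather than in first-seen order.

```shell
# Hypothetical helper (name is an assumption): sort first so that duplicate
# lines become adjacent, then count each run in a single awk pass.
count_repeats() {
    sort -- "$1" | awk '
        NR > 1 && $0 == prev { cnt++; next }   # same as previous line: extend the run
        NR > 1 { report() }                    # different line: report the finished run
        { prev = $0; cnt = 1 }                 # start a new run
        END { if (NR > 0) report() }           # report the last run, if any input was read
        function report() {
            print prev
            if (cnt > 1) print "... repeated", cnt, "times"
        }
    '
}
```

It would be called as `count_repeats file1.txt`. The memory pressure moves from awk to `sort`, which is generally better suited to inputs that do not fit in memory.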
Upvotes: 1
Reputation: 92854
sort + uniq + sed solution:
sort file1.txt | uniq -c | sed -E 's/^ +1 (.+)/\1\n/;
s/^ +([2-9]|[0-9]{2,}) (.+)/\2\n<!!! pay attention, the above sentence has repeated \1 times !!!>\n/'
The output:
abc bcd abc ...
abcd bcde cdef ...
<!!! pay attention, the above sentence has repeated 3 times !!!>
efg fgh ...
<!!! pay attention, the above sentence has repeated 2 times !!!>
hig ...
Or with awk:
sort file1.txt | uniq -c | awk '{ n=$1; sub(/^ +[0-9]+ +/,"");
printf "%s\n%s",$0,(n==1? ORS:"<!!! pay attention, the above sentence has repeated "n" times !!!>\n\n") }'
Upvotes: 2