Reputation: 1843
I have a large file that I am trying to reduce to only the neighboring lines with duplicated record IDs. (It has already been sorted.)
Example:
AB12345  10987654321 Andy Male
AB12345  10987654321 Andrea Female
CD34567  98765432100 Andrea Female
EF45678  54321098765 Bobby Tables
The command should remove lines 3-4, leaving lines 1-2.
The following search pattern successfully finds just the duplicate lines, but the subsequent :g! command removes some, but not all, of the non-matching lines.
:/\v^(\a{2}\d{5}\s{2}\d{11}).*\n(\1.*)+
:g!/\v^(\a{2}\d{5}\s{2}\d{11}).*\n(\1.*)+/d
Why aren't all the non-matching lines being deleted?
Upvotes: 0
Views: 72
Reputation: 5408
There's no "magic" version of :global (unlike :substitute, which has the :smagic and :snomagic variants).
Possible solutions: drop \v and escape the special characters yourself, like this:
:g!/^\(\a\{2}\d\{5}\s\{2}\d\{11}\).*\n\(\1.*\)\+/d
Alternatively, you can always reuse the previous search pattern: run your /\v... search first, then execute :g! with an empty pattern, which reuses the last search:
:g!//d
Upvotes: 1
Reputation: 5861
Not a Vim solution, but this should work:
$ fgrep -f <(awk -v OFS='  ' '{print $1, $2}' data.txt | sort | uniq -d) data.txt
The <(...) is a bashism (process substitution), and OFS='  ' has exactly two spaces.
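To see what each stage produces, here is a minimal sketch of the pipeline, assuming bash, a file named data.txt holding the question's sample records, and two spaces between the first two fields:

```shell
# Sample data from the question (two spaces between the first two fields).
cat > data.txt <<'EOF'
AB12345  10987654321 Andy Male
AB12345  10987654321 Andrea Female
CD34567  98765432100 Andrea Female
EF45678  54321098765 Bobby Tables
EOF

# Step 1: print the two key fields, rejoined with exactly two spaces (OFS).
# Step 2: sort | uniq -d keeps only keys that occur more than once.
awk -v OFS='  ' '{print $1, $2}' data.txt | sort | uniq -d
# -> AB12345  10987654321

# Step 3: use the duplicated keys as fixed-string patterns (grep -F is the
# modern spelling of fgrep) to keep only the duplicated records.
grep -F -f <(awk -v OFS='  ' '{print $1, $2}' data.txt | sort | uniq -d) data.txt
# -> AB12345  10987654321 Andy Male
# -> AB12345  10987654321 Andrea Female
```

Note that this matches the key anywhere in a line; since the record ID is anchored at the start of every line in this data, that is harmless here.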
Upvotes: 1