psaxton

Reputation: 1843

Vim EX command to remove non-duplicate records

I have a large file that I am trying to reduce to only the lines whose record ID is duplicated on a neighboring line. (The file has already been sorted.)

Example:

AB12345  10987654321 Andy   Male
AB12345  10987654321 Andrea Female
CD34567  98765432100 Andrea Female
EF45678  54321098765 Bobby  Tables

The command should remove lines 3-4, leaving lines 1-2.

The following regex pattern finds just the duplicate lines successfully, but the subsequent command removes some but not all of the non-matching lines.

:/\v^(\a{2}\d{5}\s{2}\d{11}).*\n(\1.*)+
:g!/\v^(\a{2}\d{5}\s{2}\d{11}).*\n(\1.*)+/d

Why aren't all the non-matching lines being deleted?

Upvotes: 0

Views: 72

Answers (2)

hawk

Reputation: 5408

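There's no "magic" version of :global. One possible solution is to escape the special characters yourself (in the default mode the grouping parentheses need backslashes too), like this:

:g!/^\(\a\{2}\d\{5}\s\{2}\d\{11}\).*\n\(\1.*\)\+/d

You can also reuse the previous search pattern by leaving the pattern empty, as in :g//d (or :g!//d to delete the lines that do not match).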

Upvotes: 1

lcd047

Reputation: 5861

Not a Vim solution, but this should work:

$ fgrep -f <(awk -v OFS='  ' '{print $1, $2}' data.txt | sort | uniq -d) data.txt

The <(...) is a bashism, and OFS='  ' has exactly two spaces, to match the separator between the first two fields in the data.
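
If process substitution is not available, the same idea can probably be sketched with awk alone by reading the file twice. This is only an illustration and not part of the original answer: the scratch file and the array name seen are made up for the example, and it assumes the record key is exactly the first two whitespace-separated fields. Because it compares fields rather than substrings, the number of spaces between them does not matter here.

```shell
# Sample records from the question, written to a scratch file (hypothetical name).
data=$(mktemp)
cat > "$data" <<'EOF'
AB12345  10987654321 Andy   Male
AB12345  10987654321 Andrea Female
CD34567  98765432100 Andrea Female
EF45678  54321098765 Bobby  Tables
EOF

# Pass 1 (NR == FNR): count each key built from the first two fields.
# Pass 2: print only the lines whose key was counted more than once.
result=$(awk 'NR == FNR { seen[$1 FS $2]++; next } seen[$1 FS $2] > 1' "$data" "$data")
printf '%s\n' "$result"
rm -f "$data"
```

On the sample data this keeps the two AB12345 lines and drops the other two. Unlike the uniq -d approach, it does not require the input to be sorted.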

Upvotes: 1
