Reputation: 1843
I have a large file that I am trying to reduce to only the neighboring lines with duplicated record IDs. (It has already been sorted.)
Example:
AB12345  10987654321 Andy Male
AB12345  10987654321 Andrea Female
CD34567  98765432100 Andrea Female
EF45678  54321098765 Bobby Tables
The command should remove lines 3-4, leaving lines 1-2.
The following search pattern successfully finds just the duplicate lines, but the subsequent :g! command removes some, but not all, of the non-matching lines.
:/\v^(\a{2}\d{5}\s{2}\d{11}).*\n(\1.*)+
:g!/\v^(\a{2}\d{5}\s{2}\d{11}).*\n(\1.*)+/d
Why aren't all the non-matching lines being deleted?
Upvotes: 0
Views: 72
Reputation: 5408
There's no "magic" version of :global (unlike :substitute, which has the :smagic and :snomagic variants).
Possible solutions: drop \v and escape the special characters yourself, like this:
:g!/^\(\a\{2}\d\{5}\s\{2}\d\{11}\).*\n\(\1.*\)\+/d
Alternatively, you can always reuse the previous search pattern: run your /\v... search first, then execute :g! with an empty pattern, which reuses the last search:
:g!//d
Upvotes: 1
Reputation: 5861
Not a Vim solution, but this should work:
$ fgrep -f <(awk -v OFS='  ' '{print $1, $2}' data.txt | sort | uniq -d) data.txt
The <(...) is a bashism (process substitution), and OFS='  ' has exactly two spaces.
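To see what each stage produces, here is a minimal sketch of the pipeline, assuming bash, a file named data.txt holding the question's sample records, and two spaces between the first two fields:

```shell
# Sample data from the question (two spaces between the first two fields).
cat > data.txt <<'EOF'
AB12345  10987654321 Andy Male
AB12345  10987654321 Andrea Female
CD34567  98765432100 Andrea Female
EF45678  54321098765 Bobby Tables
EOF

# Step 1: print the two key fields, rejoined with exactly two spaces (OFS).
# Step 2: sort | uniq -d keeps only keys that occur more than once.
awk -v OFS='  ' '{print $1, $2}' data.txt | sort | uniq -d
# -> AB12345  10987654321

# Step 3: use the duplicated keys as fixed-string patterns (grep -F is the
# modern spelling of fgrep) to keep only the duplicated records.
grep -F -f <(awk -v OFS='  ' '{print $1, $2}' data.txt | sort | uniq -d) data.txt
# -> AB12345  10987654321 Andy Male
# -> AB12345  10987654321 Andrea Female
```

Note that this matches the key anywhere in a line; since the record ID is anchored at the start of every line in this data, that is harmless here.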
Upvotes: 1