Reputation: 2093
I have a file that contains lines as follows:
one one
one one
two two two
one one
three three
one one
three three
four
I want to remove all occurrences of the duplicate lines from the file and leave only the non-duplicate lines. So, in the example above, the result should be:
two two two
four
I saw this answer to a similar looking question. I tried to modify the ex one-liner as given below:
:syn clear Repeat | g/^\(.*\)\n\ze\%(.*\n\)*\1$/exe 'syn match Repeat "^' . escape(getline ('.'), '".\^$*[]') . '$"' | d
But it does not remove all occurrences of the duplicate lines; it removes only some of them.
How can I do this in Vim or, more specifically, with ex in Vim?
To clarify, I am not looking for sort u.
Upvotes: 6
Views: 5032
Reputation: 129
Please use Perl; it can do this easily:
use strict; use warnings; use diagnostics;
# Read the input file
open(my $in, '<', 'input.txt') or die "cannot open input.txt: $!\n";
my @data1 = <$in>;
close($in);
# Count how many times each line occurs
my %rownum;
foreach my $line1 (@data1)
{
    $rownum{$line1}++;
}
# Print only the lines that occur exactly once, in their original order
open(my $out, '>', 'output.txt') or die "cannot open output.txt: $!\n";
foreach my $line1 (@data1)
{
    if ($rownum{$line1} == 1)
    {
        print $out $line1;
    }
}
close($out);
Upvotes: -1
Reputation: 172648
My PatternsOnText plugin version 1.30 now has a :DeleteAllDuplicateLinesIgnoring command. Without any arguments, it'll work as outlined in your question.
Upvotes: 1
Reputation: 5122
It does not preserve the order of the remaining lines, but this seems to work:
:sort|%s/^\(.*\)\n\%(\1\n\)\+//
(This version is @Peter Rincker's idea, with a little correction from me.) On vim 7.3, the following even shorter version works:
:sort | %s/^\(.*\n\)\1\+//
Unfortunately, due to differences between the regular-expression engines, this no longer works in vim 7.4 (including patches 1-52).
Upvotes: 1
Reputation: 5122
This is not any simpler than @Ingo Karkat's answer, but it is a little more flexible. Like that answer, this leaves the remaining lines in the original order.
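function! RepeatedLines(...)
  let first = a:0 ? a:1 : 1
  let last = (a:0 > 1) ? a:2 : line('$')
  let lines = []
  for line in range(first, last - 1)
    " Skip lines already recorded as repeated
    if index(lines, line) != -1
      continue
    endif
    let newlines = []
    " Escape the backslash for \V, and / because it delimits the :g pattern
    let text = escape(getline(line), '\/')
    " Anchor with \^ and \$ so that only whole-line matches count
    execute 'silent' (line + 1) ',' last
          \ 'g/\V\^' . text . '\$/call add(newlines, line("."))'
    if !empty(newlines)
      call add(lines, line)
      call extend(lines, newlines)
    endif
  endfor
  " Numeric sort: the default compares string forms, so 10 would sort before 9
  return sort(lines, 'n')
endfun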
:for x in reverse(RepeatedLines()) | execute x 'd' | endfor
A few notes:
The list functions used above (index(), add(), extend(), sort()) are documented under :help list-functions.
I use /\V (very nomagic) so the only character I need to escape in a search pattern is the backslash itself. See :help /\V.
Upvotes: 0
Reputation: 5948
If you have access to UNIX-style commands, you could do:
:%!sort | uniq -u
The -u option to the uniq command performs the task you require. From the uniq command's help text:
-u, --unique
only print unique lines
I should note, however, that this answer assumes you don't mind the output being sorted rather than in the order of your input file.
Upvotes: 5
Reputation: 172648
Taking the code from here and modifying it to delete the lines instead of highlighting them, you'll get this:
function! DeleteDuplicateLines() range
  " First pass: count how often each non-empty line occurs in the range
  let lineCounts = {}
  for lineText in getline(a:firstline, a:lastline)
    if lineText != ""
      let lineCounts[lineText] = get(lineCounts, lineText, 0) + 1
    endif
  endfor
  " Second pass: delete every repeated line, working bottom-up so that
  " the numbers of the lines still to be visited do not shift
  let lineNum = a:lastline
  while lineNum >= a:firstline
    let lineText = getline(lineNum)
    if lineText != "" && lineCounts[lineText] > 1
      execute lineNum . 'delete _'
    endif
    let lineNum -= 1
  endwhile
endfunction
command! -range=% DeleteDuplicateLines <line1>,<line2>call DeleteDuplicateLines()
Upvotes: 0
Reputation: 196751
Assuming you are on a UNIX derivative, the command below should do what you want:
:sort | %!uniq -u
uniq only compares adjacent lines, so we must sort them first. Vim's built-in :sort command saves some typing: it works on the whole buffer by default, so we don't need to pass it a range, and because it's a built-in command we don't need the !.
Then we filter the whole buffer through uniq -u.
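To see why the sort step matters, here is a small shell sketch that can be run outside Vim on the sample from the question. The variable names are made up for the illustration. Because uniq compares only neighbouring lines, the scattered copies of "one one" and "three three" survive when the input is not sorted first.

```shell
# The question's sample lines, held in a variable for the demonstration.
sample=$(printf '%s\n' 'one one' 'one one' 'two two two' 'one one' \
    'three three' 'one one' 'three three' 'four')

# Without sorting, only the adjacent pair at the top is recognised as repeated.
unsorted=$(printf '%s\n' "$sample" | uniq -u)

# After sorting, all copies are adjacent, so every repeated line is dropped.
sorted=$(printf '%s\n' "$sample" | sort | uniq -u)

printf 'without sort:\n%s\n\nwith sort:\n%s\n' "$unsorted" "$sorted"
```

The sorted pipeline leaves only "four" and "two two two", which is the requested result apart from the order.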
Upvotes: 2
Reputation: 195169
If you are on a Linux box with awk available, this line does what you need (note that awk's for (x in a) loop does not guarantee the original line order):
:%!awk '{a[$0]++}END{for(x in a)if(a[x]==1)print x}'
Upvotes: 3