Sydius

Reputation: 14277

Removing duplicate rows in vi?

I have a text file that contains a long list of entries (one on each line). Some of these are duplicates, and I would like to know if it is possible (and if so, how) to remove any duplicates. I am interested in doing this from within vi/vim, if possible.

Upvotes: 182

Views: 129984

Answers (16)

horta

Reputation: 1130

From here this will remove adjacent and non-adjacent duplicates without sorting:

:%!awk '\!a[$0]++'

This technically shells out to an external tool, but it is called from within vim (and therefore only works where awk is available, e.g. Linux, macOS, or BSD).
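As a quick illustration (sample input invented here), the filter keeps the first occurrence of each line and preserves the original order:

```shell
# keep only the first occurrence of each line; order is preserved
printf 'b\na\nb\nc\n' | awk '!a[$0]++'
# prints:
# b
# a
# c
```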

To do this entirely from within vim, you can use a macro and the :norm command to execute it on every line. On Linux this was fast, but on Windows it took an oddly long time. Disabling plugins with vim -u NONE seemed to help somewhat.

qa                     # start recording a macro into register 'a'
y$                     # yank the current line
:.+1,$g/<ctrl-r>0/d    # from the next line to the end of file, delete any line matching the yanked text
q                      # stop recording
:%norm! @a             # apply the macro to every line in the file

Note this doesn't remove empty lines, so running

:g/^$/d

to remove any blank lines afterwards may be useful.

Upvotes: 0

Evan

Reputation: 594

This command got me a buffer without any duplicate lines, without sorting; it keeps the first occurrence of each line (adjust python3.11 to whatever Python interpreter you have installed):

:%!python3.11 -c 'exec("import fileinput\nLINES = []\nfor line in fileinput.input():\n    line = line.splitlines()[0]\n    if line not in LINES:\n        print(line)\n        LINES.append(line)\n")'

Upvotes: 0

John Poulis

Reputation: 59

If you don't want to sort/uniq the entire file, you can select the lines you want to deduplicate in visual mode and then simply run :sort u.

Upvotes: 4

Sean

Reputation: 5334

Try this:

:%s/^\(.*\)\(\n\1\)\+$/\1/

It searches for any line immediately followed by one or more copies of itself, and replaces it with a single copy.

Make a copy of your file though before you try it. It's untested.

Upvotes: 42

paul

Reputation: 4487

This worked for me for both .csv and .txt

awk '!seen[$0]++' <filename> > <newFileName>

Explanation: the first part of the command, awk '!seen[$0]++' <filename>, prints only the first occurrence of each row; the second part, > <newFileName>, redirects that output into a new file.
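A minimal end-to-end sketch of the same command (the file names and sample contents here are invented for illustration):

```shell
# hypothetical file names, for illustration only
tmp=$(mktemp -d)
printf 'x\ny\nx\n' > "$tmp/in.txt"
# print only the first occurrence of each row, saving to a new file
awk '!seen[$0]++' "$tmp/in.txt" > "$tmp/out.txt"
cat "$tmp/out.txt"
# prints:
# x
# y
```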

Upvotes: -1

william-1066

Reputation: 449

An alternative method that does not use vi/vim (useful for very large files) is to use sort and uniq from the Linux command line:

sort {file-name} | uniq

(Note: uniq -u would print only the lines that are never duplicated, discarding every line that has a duplicate, which is usually not what you want here.)
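The difference between plain uniq and uniq -u matters here (sample input invented for the example):

```shell
# plain uniq keeps one copy of each line
printf 'a\nb\nb\nc\n' | sort | uniq     # a, b, c
# uniq -u keeps only the lines that are never duplicated
printf 'a\nb\nb\nc\n' | sort | uniq -u  # a, c
```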

Upvotes: 0

SergioAraujo

Reputation: 11820

This version only removes repeated lines that are contiguous, i.e. it only deletes consecutive duplicates. With the given mapping, the function does not touch blank lines, but if you change the :g pattern to match the start of line (^) it will also remove duplicated blank lines.

" function to delete duplicate lines
function! DelDuplicatedLines()
    while getline(".") == getline(line(".") - 1)
        exec 'norm! ddk'
    endwhile
    while getline(".") == getline(line(".") + 1)
        exec 'norm! dd'
    endwhile
endfunction
nnoremap <Leader>d :g/./call DelDuplicatedLines()<CR>

Upvotes: 0

Rovin Bhandari

Reputation: 475

awk '!x[$0]++' yourfile.txt if you want to preserve the order (i.e., when sorting is not acceptable). To filter the buffer through it from within vim, :%! can be used.

Upvotes: 15

derobert

Reputation: 51167

Select the lines in visual-line mode (Shift+v), then :!uniq. That'll only catch duplicates which come one after another.
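Since uniq only collapses adjacent duplicates, a non-adjacent repeat survives the filter (invented sample input):

```shell
printf 'a\nb\na\n' | uniq
# prints:
# a
# b
# a    <- non-adjacent duplicate is kept
```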

Upvotes: 4

Chris Dodd

Reputation: 2960

I would use !}uniq, but that only works if there are no blank lines.

For every line in a file use: :1,$!uniq (or equivalently :%!uniq).

Upvotes: 0

Kevin

Reputation: 1569

From command line just do:

sort file | uniq > file.new
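For what it's worth, sort -u combines both steps into one command (sample input invented here):

```shell
printf 'b\na\nb\n' | sort | uniq   # a, b
printf 'b\na\nb\n' | sort -u       # same result in a single command
```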

Upvotes: 32

cn8341

Reputation: 129

:%s/^\(.*\)\(\n\1\)\+$/\1/gec

or

:%s/^\(.*\)\(\n\1\)\+$/\1/ge

This removes runs of duplicate lines while keeping one copy of each. The e flag suppresses the error when no match is found, and the c flag asks for confirmation before each change.

Upvotes: 0

Bridgey

Reputation: 539

g/^\(.*\)$\n\1$/d

Works for me on Windows. Lines must be sorted first though. (The trailing $ after \1 ensures the whole next line must match, not just a prefix of it.)

Upvotes: 6

Luc Hermitte

Reputation: 32966

Regarding how Uniq can be implemented in VimL, search for Uniq in a plugin I'm maintaining. You'll see various ways to implement it that were given on the Vim mailing list.

Otherwise, :sort u is indeed the way to go.

Upvotes: 1

Jon DellOro

Reputation: 533

I would combine two of the answers above:

go to head of file
sort the whole file
remove duplicate entries with uniq

1G
!Gsort
1G
!Guniq

If you were interested in seeing how many duplicate lines were removed, use control-G before and after to check on the number of lines present in your buffer.

Upvotes: 6

Brian Carper

Reputation: 72946

If you're OK with sorting your file, you can use:

:sort u

Upvotes: 421
