Reputation: 33
I'm new to Vim and I'm trying to split a file into multiple files. This is a test file:
Something1;XXXword;blabla(about 500 signs);
Something1;XXXword;(about 500 signs);
Something1;YYYword;(about 500 signs);
Something1;RRRword;(about 500 signs);
XXX could be a word 2-20 characters long. Whenever the second field (XXX/YYY/RRR) changes, the file should be cut before "Something1", and the lines that follow, up to the next change, should go into a new file, and so on.
The result should look like this:
File1:
Something1;XXXword;blabla(about 500 signs);Something1;XXXword;(about 500 signs);
File2:
Something1;YYYword;(about 500 signs);
File3:
Something1;RRRword;(about 500 signs)
Is there a way to do this like a pro? Thanks :)
Upvotes: 0
Reputation: 45097
I would recommend a different tool, like Awk.
awk -F';' '{printf "%s", $0 >> $2}' your_file.txt
This will split each line into columns separated by ;. Each line will be appended (>>) to a file named after the 2nd column, $2 (e.g. XXXword). Append/print the whole line, $0, except the newline (printf "%s") to the new file so everything is one long line.
Note: I am using gawk as my awk implementation; you may need to make adjustments depending on your awk implementation.
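To see what this does on a small input, here is a minimal sketch. The sample data is hypothetical and shortened, and the command is the same one-liner as above except that the file name is wrapped in parentheses, which some awk implementations prefer for redirection targets.

```shell
#!/bin/sh
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data; the real lines are about 500 characters long.
cat > your_file.txt <<'EOF'
Something1;XXXword;blabla;
Something1;XXXword;more;
Something1;YYYword;other;
Something1;RRRword;last;
EOF

# Append each line, without its newline, to a file named after field 2.
awk -F';' '{ printf "%s", $0 >> ($2) }' your_file.txt

# Print each output file name followed by its contents.
for f in XXXword YYYword RRRword; do
    printf '%s: %s\n' "$f" "$(cat "$f")"
done
```

Because the redirection uses >>, running the command a second time appends to the files created by the first run, so remove old output files before re-running.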
In the following case where you had XXX, YYY, XXX:
Something1;XXXword;blabla(about 500 signs);
Something1;YYYword;(about 500 signs);
Something1;XXXword;(about 500 signs);
If this should yield 3 files (1 YYY file and 2 XXX files) then we can use Awk as well:
awk -F';' 'last != $2 {f[$2]++} {printf "%s", $0 >> ($2 f[$2]); last = $2}' your_file.txt
This will yield files: XXXword1, XXXword2, and YYYword1
This is similar to the awk example above, except we use a dictionary/array, f, to count the number of times the 2nd column changes from the previous line: last != $2 {f[$2]++}. Make sure to set last to the 2nd column after printing each line. The line, $0, is output to a file named after $2 f[$2] (an adjacent variable and string are concatenated).
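The counting variant can be tried the same way. This is a sketch under a few assumptions: the sample data is hypothetical, the variable name out is introduced here for readability, and the close(out) call is an addition that is not in the one-liner above. It is meant to avoid running out of open file handles when there are many distinct words, and it is harmless with >> because reopening the file appends.

```shell
#!/bin/sh
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data with the sequence XXX, YYY, XXX.
cat > your_file.txt <<'EOF'
Something1;XXXword;blabla;
Something1;YYYword;other;
Something1;XXXword;more;
EOF

awk -F';' '
    # Field 2 differs from the previous line: start a new run for this word.
    last != $2 { f[$2]++ }
    {
        out = $2 f[$2]          # e.g. "XXXword" followed by 1 gives "XXXword1"
        printf "%s", $0 >> out  # append the line without its newline
        close(out)              # assumption: added to limit open files
        last = $2
    }
' your_file.txt

# List the generated files next to the input file.
ls
```

Building the name in a separate variable also sidesteps the question of how a given awk parses an unparenthesized concatenation after >>.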
Upvotes: 6
Reputation: 32926
You'll have to program it as you would have programmed it in any other language. My first reflex would have been Perl BTW.
function! s:split(root) abort
  " todo: check empty buffers
  let lines = getline(1, '$')
  let nb_lines = len(lines)
  let files = []
  let crt = 0
  while crt < nb_lines
    " I suppose the word is the second field in a .csv file
    let word = matchstr(lines[crt], '^[^;]*;\zs[^;]*\ze;')
    " This is where the real magic happens, see :h /\@!
    let next = match(lines, '^[^;]*;\(\('.word.'\)\@![^;]\)*;', crt)
    if next == -1 | let next = nb_lines | endif
    let files += [ lines[crt : (next-1)] ]
    let crt = next
  endwhile
  echo files
endfunction
command! -nargs=1 SplitBuffer :call s:split("<args>")
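Instead of let files += [ something ], you will want to execute something like the following (note that writefile() takes the list of lines first and the file name second):
:let index = 0
...
:for...
...
:call writefile(lines[crt : (next-1)], a:root.index)
:let index += 1
:endfor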
EDIT:
In case the sequence XXX, YYY, XXX shall lead to two files instead of 3, it can be done with this (convoluted and untested) oneliner -- still, prefer @Peter Rincker's awk based solution.
:call map(getline(1, '$'), 'writefile([v:val], split(v:val, ";")[1], "a")')
Upvotes: 1