Reputation: 2923
I am trying to repair a data file so that I can use MySQL's LOAD DATA INFILE to import the data into a database.
The problem is that the file contains long text fields with newlines inside the text, and a newline is also what marks a new record. This makes it hard to import the records into MySQL.
How can I use Vim on Linux to search for illegal newlines and replace them with a space?
Illegal newline: any newline found between a comma ( , ) and the next ( ,012d000 )
This is a sample of the data in the file:
VST-65654,a0Jd000000FM8cBEAT,Blah,2013-10-22 10:46:30.000000,Blah Blah,2014-01-20 20:27:42.000000,2013-10-18 14:00:00.000000,005d0000002biR4AAI,001d000001NEh0oAAD,In Person,Unscheduled,Grow Applications,High,this is the body
of this
log test
where I need to
remove all extra new lines,012d0000000ppiXAAQ
VST-122549,a0Jd000000GVwtyEAD,Blah,2013-10-31 18:17:50.000000,Blah,2013-11-06 18:07:47.000000,2013-10-31 18:10:00.000000,005d0000002biR9AAI,001d000001NEaQgAAL,In Person,Scheduled,Grow Applications,Medium,One more long paragraph
where I need to remove all extra
new lines
,012d0000000ppiABCD
The fields are separated by a comma ( , ), and a new record should begin when a newline \n is found. How can I do such a search and replace to fix this issue?
Or how can I replace all unescaped commas with a comma wrapped in double quotes? That is, if a comma is preceded by a backslash ( \, ), leave it alone, but replace any other comma with ","
Thanks
Upvotes: 0
Views: 96
Reputation: 5112
I like @Peter Rincker's answer. As for the question you asked at the end, you can replace all the un-escaped commas with "," using
:%s/\\\@<!,/","/g
Here, \\ represents a literal backslash and \@<! is a negative look-behind modifier: the comma only matches when it is not preceded by a backslash. (See :help /\@<! .)
The problem with this solution is that you have not correctly defined what an un-escaped comma is. For example, \\, is an escaped backslash followed by an un-escaped comma. I believe that /\\\@<!\%(\\\\\)*\zs,/ is the correct pattern, but I do not say it is pretty. It is a little better if you use the "very magic" version: /\v\\@<!%(\\\\)*\zs,/ .
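For comparison, here is a rough sketch of the same "even number of backslashes" rule outside Vim, using Python's re module. The helper name quote_unescaped_commas is made up for illustration, and the sketch assumes that backslash is the only escape character in the file.

```python
import re

# A comma counts as un-escaped when it is preceded by an even number
# (possibly zero) of backslashes.
# Assumption: "\" is the only escape character in the data.
UNESCAPED_COMMA = re.compile(r"(?<!\\)((?:\\\\)*),")


def quote_unescaped_commas(line):
    # Hypothetical helper: keep any escaped-backslash pairs (group 1)
    # and replace only the comma itself with ",".
    return UNESCAPED_COMMA.sub(lambda m: m.group(1) + '","', line)


if __name__ == "__main__":
    # One plain comma, one escaped comma, one comma after an escaped backslash.
    print(quote_unescaped_commas(r"a,b\,c\\,d"))
```

The look-behind stops the match from starting in the middle of a run of backslashes, which is the same job \\\@<! does in the Vim pattern.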
Upvotes: 0
Reputation: 45107
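:g/^VST/,-/,012d000/j!
Use the global command, :g, to join together, :j, the line starting with VST with all the lines through the next instance of 012d000. Note that the ! makes :j join the lines without inserting any spaces; plain :j normally puts a single space at each join instead.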
For more help see:
:h :g
:h :j
:h [range]
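If you would rather pre-process the file outside Vim, the same "join everything up to the next ,012d000" idea could be sketched in Python roughly like this. The function name join_records and the RECORD_END pattern are assumptions based only on the sample data (every record ends with a field that starts with ,012d000), so adjust them to the real file.

```python
import re

# Assumption taken from the sample data: the last field of every record
# starts with ",012d000" and runs to the end of the line.
RECORD_END = re.compile(r",012d000\w*$")


def join_records(lines, sep=" "):
    """Glue physical lines together until a line ends with the record marker."""
    records, parts = [], []
    for line in lines:
        parts.append(line.rstrip("\n"))
        if RECORD_END.search(parts[-1]):
            # End of a logical record: emit it as one line.
            records.append(sep.join(parts))
            parts = []
    if parts:
        # Trailing lines without a marker are kept rather than dropped.
        records.append(sep.join(parts))
    return records


if __name__ == "__main__":
    sample = [
        "VST-1,a,this is the body\n",
        "of this\n",
        "log test,012d0000000ppiXAAQ\n",
        "VST-2,b,One more\n",
        "new lines\n",
        ",012d0000000ppiABCD\n",
    ]
    for record in join_records(sample):
        print(record)
```

Passing sep="" would mimic :j! (no space at the joins), while the default sep=" " replaces each stray newline with a space as the question asks.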
Upvotes: 2
Reputation: 7678
My regex-fu isn't powerful enough to do that in a single command, but you could create a macro to achieve what you want. The following worked for the input you gave:
Go to start of file
gg
Start recording
qq
Find next ,012d
/,012d<CR>
Go up one line
k
Enter visual mode
v
Go to previous comma
?,<CR>
Replace all newline characters in the selection with a space
:s/\n/ /g<CR>
Go down one line
j
Finish recording
q
Repeat
@q
Result
VST-65654,a0Jd000000FM8cBEAT,Blah,2013-10-22 10:46:30.000000,Blah Blah,2014-01-20 20:27:42.000000,2013-10-18 14:00:00.000000,005d0000002biR4AAI,001d000001NEh0oAAD,In Person,Unscheduled,Grow Applications,High,this is the body of this log test where I need to remove all extra new lines,012d0000000ppiXAAQ
VST-122549,a0Jd000000GVwtyEAD,Blah,2013-10-31 18:17:50.000000,Blah,2013-11-06 18:07:47.000000,2013-10-31 18:10:00.000000,005d0000002biR9AAI,001d000001NEaQgAAL,In Person,Scheduled,Grow Applications,Medium,One more long paragraph where I need to remove all extra new lines ,012d0000000ppiABCD
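For what it's worth, outside Vim the whole thing can be squeezed into a single substitution. The sketch below uses Python's re module and rests on an assumption taken from the sample only: every real record starts with VST- and no line inside a text body does. The function name flatten_records is made up.

```python
import re

# A newline is "illegal" unless the next line starts a new record ("VST-")
# or it is the final newline of the file.
# Assumption: body text never has a line that begins with "VST-".
ILLEGAL_NEWLINE = re.compile(r"\n(?!VST-|\Z)")


def flatten_records(text):
    # Hypothetical helper: replace every illegal newline with one space.
    return ILLEGAL_NEWLINE.sub(" ", text)


if __name__ == "__main__":
    raw = (
        "VST-1,a,this is the body\n"
        "of this\n"
        "log test,012d0000000ppiXAAQ\n"
        "VST-2,b,One more\n"
        "new lines\n"
        ",012d0000000ppiABCD\n"
    )
    print(flatten_records(raw), end="")
```

This keys off the start of the next record rather than the ,012d marker that the macro searches for, so it is only a safe swap if the VST- prefix really is unique to record starts.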
Upvotes: 1