Reputation: 2923

How can I remove illegal newlines that appear within a text file using the Vim text editor?

I am trying to repair a data file so that I can import it into a database with MySQL's LOAD DATA INFILE.

The problem is that the file contains lengthy text fields with newlines inside them, and a newline also marks the start of a new record. This makes it hard to import the records into MySQL.

How can I use Vim on Linux to search for illegal newlines and replace them with a space?

An illegal newline is any newline found between a comma ( , ) and ( ,012d000 ).

This is a sample of the data in the file:

VST-65654,a0Jd000000FM8cBEAT,Blah,2013-10-22 10:46:30.000000,Blah Blah,2014-01-20 20:27:42.000000,2013-10-18 14:00:00.000000,005d0000002biR4AAI,001d000001NEh0oAAD,In Person,Unscheduled,Grow Applications,High,this is the body

of this 
log test
where I need to

remove all extra new lines,012d0000000ppiXAAQ
VST-122549,a0Jd000000GVwtyEAD,Blah,2013-10-31 18:17:50.000000,Blah,2013-11-06 18:07:47.000000,2013-10-31 18:10:00.000000,005d0000002biR9AAI,001d000001NEaQgAAL,In Person,Scheduled,Grow Applications,Medium,One more long paragraph

where I need to remove all extra

new lines

,012d0000000ppiABCD

The fields are separated by a comma ( , ), and a new record should begin only when a newline (\n) is found. How can I do a search and replace to fix this?

Or how can I replace all unescaped commas with double quotes? That is, if a comma is escaped as \, leave it alone, but replace any other comma with ","

Thanks

Upvotes: 0

Answers (3)

benjifisher

Reputation: 5112

I like @Peter Rincker's answer. As for the question you asked at the end, you can replace all the un-escaped commas with "," using

:%s/\\\@<!,/","/g

Here, \\ represents a literal backslash and \@<! is a negative look-behind: the comma matches only if it is not immediately preceded by a backslash. (See :help /\@<!.)

The problem with this solution is that it does not correctly define what an un-escaped comma is. For example, \\, is an escaped backslash followed by an un-escaped comma. I believe that /\\\@<!\%(\\\\\)*\zs,/ is the correct pattern, since it allows any number of escaped backslashes (pairs of backslashes) before the comma, though I would not call it pretty. It is a little better if you use the "very magic" version: /\v\\@<!%(\\\\)*\zs,/.
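
For readers who want to check the pattern outside Vim, here is a minimal Python sketch of the same idea. It is not part of the original answer: the function name quote_unescaped_commas is hypothetical, and it assumes a backslash is the only escape character. Python's re module has no \zs, so the run of escaped backslashes is captured and put back in the replacement.

```python
import re

# A comma counts as unescaped when it is preceded by an even number
# (possibly zero) of backslashes.
#   (?<!\\)      no backslash immediately before the match
#   ((?:\\\\)*)  zero or more escaped backslashes, captured as group 1
#   ,            the comma itself
UNESCAPED_COMMA = re.compile(r'(?<!\\)((?:\\\\)*),')


def quote_unescaped_commas(line):
    """Replace each unescaped comma with "," and leave escaped ones alone."""
    # \1 restores any escaped backslashes that preceded the comma.
    return UNESCAPED_COMMA.sub(r'\1","', line)


if __name__ == "__main__":
    print(quote_unescaped_commas(r'a,b\,c\\,d'))
```

Capturing the backslashes is only a workaround for the missing \zs; the set of commas that match is intended to be the same as in the Vim pattern.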

Upvotes: 0

Peter Rincker

Reputation: 45107

g/^VST/,-/,012d000/j!

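Use the global command, :g, to join (:j) each line starting with VST with all the lines through the next line containing ,012d000. The - in the range starts the search one line above the current line, so a record that already sits on a single line matches itself and is left alone. The ! makes :j join without inserting or removing any spaces; without it, :j normally inserts a space at each join.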

For more help see:

:h :g
:h :j
:h [range]
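
The same join can be sketched outside Vim. The Python below is an illustration only, assuming (as the sample data suggests) that every record starts with VST and ends on the first line containing ,012d000. The name join_records and its parameters are hypothetical. With sep="" it joins without adding spaces, as :j! does; pass sep=" " to put a space at each join instead.

```python
def join_records(lines, start="VST", end=",012d000", sep=""):
    """Join each record's lines, from a `start` line through an `end` line."""
    records = []
    buffer = None  # pieces of the record currently being assembled
    for raw in lines:
        line = raw.rstrip("\n")
        if buffer is None:
            if not line.startswith(start):
                records.append(line)  # outside any record: pass through
                continue
            buffer = [line]
        else:
            buffer.append(line)
        if end in line:
            # The record is complete, so emit it as a single line.
            records.append(sep.join(buffer))
            buffer = None
    if buffer is not None:
        # Unterminated final record: emit what was collected.
        records.append(sep.join(buffer))
    return records


if __name__ == "__main__":
    sample = ["VST-1,a,body", "", "more", "end,012d000X",
              "VST-2,b,one line,012d000Y"]
    for record in join_records(sample):
        print(record)
```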

Upvotes: 2

Brett Y

Reputation: 7678

My regex-fu isn't powerful enough to do that in a single command, but you could record a macro to achieve what you want. The following worked for the input you gave.

Go to start of file

gg

Start recording

qq

Find next ,012d

/,012d<CR>

Go up one line

k

Enter visual mode

v

Go to previous comma

?,<CR>

Replace all new line chars

:s/\n//g<CR>

Go down one line

j

Finish recording

q

Repeat

@q

Result

VST-65654,a0Jd000000FM8cBEAT,Blah,2013-10-22 10:46:30.000000,Blah Blah,2014-01-20 20:27:42.000000,2013-10-18 14:00:00.000000,005d0000002biR4AAI,001d000001NEh0oAAD,In Person,Unscheduled,Grow Applications,High,this is the body of this log test where I need to remove all extra new lines,012d0000000ppiXAAQ
VST-122549,a0Jd000000GVwtyEAD,Blah,2013-10-31 18:17:50.000000,Blah,2013-11-06 18:07:47.000000,2013-10-31 18:10:00.000000,005d0000002biR9AAI,001d000001NEaQgAAL,In Person,Scheduled,Grow Applications,Medium,One more long paragraph where I need to remove all extra new lines ,012d0000000ppiABCD
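
A rough Python equivalent of the macro, offered as a sketch rather than a finished import tool: it finds each stretch of text between a comma and the following ,012d and collapses the newlines inside it into single spaces. The function name collapse_newlines is hypothetical, and like the macro it assumes the long text field contains no commas of its own.

```python
import re

# A comma, then any run of non-comma characters (newlines included),
# provided the very next thing is ",012d".
FIELD_BEFORE_ID = re.compile(r",[^,]*(?=,012d)")


def collapse_newlines(text):
    """Replace newlines inside the field just before ,012d with one space."""
    def fix(match):
        # Each run of whitespace that contains a newline becomes one space.
        return re.sub(r"\s*\n\s*", " ", match.group(0))
    return FIELD_BEFORE_ID.sub(fix, text)


if __name__ == "__main__":
    broken = ("VST-1,High,this is the body\n\nof this \nlog test"
              ",012d0000000ppiXAAQ\n")
    print(collapse_newlines(broken), end="")
```

The newline that ends each record is left alone, because the text after it (the next VST id) is followed by a comma that does not start ,012d.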

Upvotes: 1
