JPro

Reputation: 6546

Excluding a column in a CSV file with regex

Is there any way to exclude, delete, or replace one field in a CSV file using a regex in Notepad++?

I have a csv file with some data like this:

'1','data1','data2','data3','data4','data5','data6','data7','data8','data9',
'data10','data11','data12','data13','data14','data15','data16','data17','data18',
'data19','data20','data21','data22','data23','\'data24 with some commas, 
here and there and some "double quotes", and fullstops.','data25','data26'

The only problem I am facing is with data24, where I encounter \' followed by double quotes and stray characters such as , and . inside the field. The problem is always in the 24th field. For clarity I have inserted line breaks here, but the entire text above is on just one line.

Any ideas on how to solve?

Thanks.

Upvotes: 2

Views: 1800

Answers (3)

Ken Bloom

Reputation: 58780

I suggest using something like Ruby's CSV library to read the file in, process it programmatically, and write it out again.
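As a rough sketch of that approach, here is the same idea in Python rather than Ruby (a substitution on my part, since Python's csv module can be told that \' is an escaped quote). The function names `drop_field` and `to_csv` and the short sample line are made up for illustration, and the dialect settings assume the file uses single quotes around every field with backslash escapes, as in the question.

```python
import csv
import io

def drop_field(text, index):
    """Parse single-quoted CSV text and return the rows without the field at `index` (0-based)."""
    reader = csv.reader(
        io.StringIO(text),
        quotechar="'",      # fields are wrapped in single quotes
        escapechar="\\",    # \' inside a field is a literal quote
        doublequote=False,
    )
    return [row[:index] + row[index + 1:] for row in reader]

def to_csv(rows):
    """Write rows back out in the same single-quoted style."""
    out = io.StringIO()
    writer = csv.writer(
        out,
        quotechar="'",
        escapechar="\\",
        doublequote=False,
        quoting=csv.QUOTE_ALL,
        lineterminator="\n",
    )
    writer.writerows(rows)
    return out.getvalue()

# Hypothetical short line; the messy field is at index 2 here.
sample = "'1','data1','\\'messy, \"quoted\" text.','data3'\n"
print(to_csv(drop_field(sample, 2)))
```

For the file in the question, the index would be 23 if the messy field really is the 24th one; that is worth checking against the real data first.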

Upvotes: 0

zifot

Reputation: 2688

I'm not sure if I understand you correctly. Do you want to remove field number 24?

To keep only the first L fields and the last R fields (thus excluding fields L+1, ..., NF - R, where NF is the number of fields), without worrying about odd characters in the fields in between, you can use the following awk command:

awk 'BEGIN {FS=","; L=23; R=2} { for(i=1; i<=L; i++) printf("%s,", $i); for(i=NF-R+1; i<=NF; i++) printf("%s%s", $i, (i<NF ? "," : "\n")) }' your_file
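If awk is not at hand, the same "keep L from the left and R from the right" idea can be sketched in Python. This is only an illustration: `keep_outer_fields` is a made-up name, and like the awk version it splits on every comma, so it relies on the messy field being the only one that contains extra commas.

```python
def keep_outer_fields(line, left=23, right=2):
    """Keep the first `left` and last `right` comma-separated pieces of a line."""
    parts = line.rstrip("\n").split(",")
    # Everything between the two slices (the messy middle) is dropped.
    return ",".join(parts[:left] + parts[len(parts) - right:])

# Hypothetical short line: keep 2 fields on the left and 1 on the right.
line = "'1','data1','\\'messy, \"quoted\", text.','data3'\n"
print(keep_outer_fields(line, left=2, right=1))
```

Counting from the right is what makes this work: the number of commas inside the messy field does not matter as long as the fields after it are clean.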

As Dave M mentioned you can get tools like cut (and awk) for Windows from here (this particular package contains gawk which should work as well with the same command)

Edit: Yeah, the download link at SourceForge does not seem to work. You can get awk and cut from here:

awk: http://gnuwin32.sourceforge.net/packages/gawk.htm

cut: http://gnuwin32.sourceforge.net/packages/coreutils.htm

Upvotes: 0

Sjoerd

Reputation: 75619

Not reliably. It is probably easiest to change the file with a tool that knows how to handle CSV, such as OpenOffice.

If you still want to use a regex, take a look at negative lookbehind, so that you match a single quote only if it is not preceded by a backslash.
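To illustrate the lookbehind, here is a small Python sketch (the pattern `(?<!\\)'` should behave the same way in other PCRE-style engines, but that is an assumption to verify in your editor). The helper `quoted_fields` and the sample line are invented for the example. It pairs up the unescaped quotes to find the field boundaries.

```python
import re

# A single quote that is NOT immediately preceded by a backslash.
UNESCAPED_QUOTE = re.compile(r"(?<!\\)'")

def quoted_fields(line):
    """Return the text between consecutive pairs of unescaped single quotes."""
    positions = [m.start() for m in UNESCAPED_QUOTE.finditer(line)]
    # Quotes 0 and 1 delimit the first field, 2 and 3 the second, and so on.
    return [line[a + 1:b] for a, b in zip(positions[0::2], positions[1::2])]

# Hypothetical short line with one messy field in the middle.
line = "'a','\\'b, with \"commas\".','c'"
print(quoted_fields(line))
```

This is also why the answer is "not reliably": a field that ends in an escaped backslash (\\ right before the closing quote) would fool the lookbehind, which a real CSV parser would handle.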

Upvotes: 2
