sboga

Reputation: 51

Java regex to remove unwanted double quotes in CSV

I have a CSV file that contains the following line. As you can see, the numbers are NOT enclosed in double quotes.

String theLine = "Corp:Industrial","5Nearest",51.93000000,"10:21:29","","","","10:21:29","7/5/2016","PER PHONE CALL WITH SAP, CORRECTING "C","359/317 97 SMRD 96.961 MADV",""

I try to read the above line and split it using this regex:

String[] tokens = theLine.split(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");

This doesn't split at every field-separating comma the way I want. The field "PER PHONE CALL WITH SAP, CORRECTING "C", is messing it up because it contains an additional , (comma) and an additional " (double quote). Can someone please help me write a regex that will escape an additional double quote and a comma within two double quotes?

I basically want:

"Corp:Industrial","5Nearest",51.93000000,"10:21:29","","","","10:21:29","7/5/2016","**PER PHONE CALL WITH SAP CORRECTING C**","359/317 97 SMRD 96.961 MADV",""

Upvotes: 1

Views: 1172

Answers (1)

Geoffrey Wiseman

Reputation: 5637

There are jobs that parsers are much better at than regular expressions, and this sort of thing is typically one of them. I'm not saying you can't make a regex work for you, but ... there are also open-source CSV parsers you could press into service.
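To give a feel for what a parser does differently from a single regex, here is a minimal hand-rolled sketch. It is not any particular library's implementation; the class name `SimpleCsv` and method `parseLine` are made up for illustration, and it assumes well-formed input in which a literal quote inside a quoted field is written as two quotes.

```java
import java.util.ArrayList;
import java.util.List;

class SimpleCsv {
    // Splits one CSV record into fields, tracking whether we are inside quotes.
    // Assumption: a literal quote inside a quoted field is written as "".
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote becomes one literal quote
                        i++;                 // skip the second quote of the pair
                    } else {
                        inQuotes = false;    // closing quote of the field
                    }
                } else {
                    current.append(c);       // commas inside quotes are plain text
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote of a field
            } else if (c == ',') {
                fields.add(current.toString()); // unquoted comma ends the field
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        String line = "\"Corp:Industrial\",51.93000000,\"\",\"SAP, CORRECTING \"\"C\"\"\"";
        for (String field : parseLine(line)) {
            System.out.println("[" + field + "]");
        }
    }
}
```

For that sample line the sketch yields four fields, the last one being `SAP, CORRECTING "C"` with the comma and the inner quotes preserved. A maintained CSV library will cover cases this ignores, such as quoted fields that span several lines.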

Having said that, your CSV looks suspect to me.

"PER PHONE CALL WITH SAP, CORRECTING "C",

That value has three quotes in it -- is it meant to represent a string with only a single quote inside? Or should the C be surrounded by quotes as well as the String?

Normally, if you're going to include a double quote inside a double-quoted value, you need a special syntax for it. For CSV, the most common options are doubling it or escaping it with a character like a backslash:

"PER PHONE CALL WITH SAP, CORRECTING ""C""",

Or:

"PER PHONE CALL WITH SAP, CORRECTING \"C\"",

None of this will directly solve your problem with regular expressions, but once you have well-formed CSV, your odds of parsing it successfully go up.
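If you can't get the file fixed at the source, one possible workaround is a lenient splitter built on a guess about what the producer meant. The sketch below assumes (and this is purely my assumption, not a CSV rule) that a quote inside a quoted field really closes the field only when it is followed by a comma or the end of the line; any other quote is treated as stray and dropped, along with embedded commas, since that is the output you asked for. The names `LenientCsv` and `splitLenient` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class LenientCsv {
    // Heuristic splitter for lines with unescaped quotes inside quoted fields.
    // Assumption: inside quotes, a quote closes the field only if the next
    // character is a comma or the line ends; otherwise it is a stray quote.
    static List<String> splitLenient(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        int n = line.length();
        for (int i = 0; i < n; i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    boolean closes = (i + 1 == n) || line.charAt(i + 1) == ',';
                    if (closes) {
                        inQuotes = false;    // genuine closing quote
                    }
                    // otherwise: stray quote, dropped
                } else if (c != ',') {
                    current.append(c);       // embedded commas are dropped too
                }
            } else if (c == '"' && current.length() == 0) {
                inQuotes = true;             // quote at the start of a field opens it
            } else if (c == ',') {
                fields.add(current.toString()); // separator between fields
                current.setLength(0);
            } else {
                current.append(c);           // unquoted content such as numbers
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        String theLine = "\"Corp:Industrial\",\"5Nearest\",51.93000000,\"10:21:29\",\"\",\"\",\"\","
                + "\"10:21:29\",\"7/5/2016\",\"PER PHONE CALL WITH SAP, CORRECTING \"C\","
                + "\"359/317 97 SMRD 96.961 MADV\",\"\"";
        for (String field : splitLenient(theLine)) {
            System.out.println("[" + field + "]");
        }
    }
}
```

On your sample line this gives twelve fields, with the tenth coming out as `PER PHONE CALL WITH SAP CORRECTING C`. Be aware that the heuristic guesses wrong whenever a value legitimately contains a quote immediately followed by a comma, so treat it as a stopgap rather than a substitute for well-formed input.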

Upvotes: 2
