Reputation: 1087
I'm trying to parse a CSV file in C# by splitting each line on commas (,). I got the split to work with this regex:
[\t,](?=(?:[^\"]|\"[^\"]*\")*$)
Splitting this string:
2012-01-06,"Some text with, comma",,"300,00","143,52"
Gives me:
2012-01-06
"Some text with, comma"
"300,00"
"143,52"
But I can't figure out how to strip the surrounding double quotes from the output so I get this instead:
2012-01-06
Some text with, comma
300,00
143,52
Any suggestions?
Upvotes: 1
Views: 7226
Reputation: 6876
If you are trying to parse a CSV and using .NET, don't use regular expressions. Use a component that was created for this purpose. See the question CSV File Imports in .Net.
I know the CSV specification looks simple enough, but trust me, you are in for heartache and destruction if you continue down this path.
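As one illustration of such a component, .NET itself ships a TextFieldParser class in the Microsoft.VisualBasic.FileIO namespace, and it can be used from C#. Below is a minimal sketch, assuming the Microsoft.VisualBasic assembly is referenced (it is part of .NET Framework and of .NET Core 3.0 and later). The names CsvSketch and ParseLine are made up for this example.

```csharp
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class CsvSketch
{
    // Hypothetical helper: parses one CSV line into its fields,
    // with the enclosing quotes already removed by the parser.
    public static string[] ParseLine(string line)
    {
        using (var parser = new TextFieldParser(new StringReader(line)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;
            // ReadFields returns null when there is no more data to read.
            return parser.ReadFields() ?? new string[0];
        }
    }
}
```

Calling CsvSketch.ParseLine on the line from the question should yield five fields without the surrounding quotes, including the empty third field.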
Upvotes: 2
Reputation: 23796
So, something like this. Again, I wouldn't use RegEx for this purpose, but YMMV.
// Requires: using System.Linq; using System.Text.RegularExpressions;
var sp = Regex.Split(a, "[\t,](?=(?:[^\"]|\"[^\"]*\")*$)")
    .Select(s => Regex.Replace(s.Replace("\"\"", "\""), "^\"|\"$", ""))
    .ToArray();
The idea is to first replace each doubled double quote ("") with a single double quote. The resulting string is then fed to the second regex, which simply removes a double quote at the beginning and at the end of the string.
The reason for the first replace is because of strings like this:
var a = "1999,Chevy,\"Venture \"\"Extended Edition, Very Large\"\" Dude\",\"\",\"5000.00\"";
This gives you a field containing ""Extended Edition, Very Large"", and each doubled quote needs to be collapsed into a single double quote.
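The two steps can also be pulled into a small helper. The sketch below is a variation, not the exact code above: it strips the enclosing quotes first and collapses doubled quotes second, on the assumption that a field such as """" (a single escaped quote) should come out as " and not as an empty string. QuoteSketch and Unquote are names invented for this example.

```csharp
public static class QuoteSketch
{
    // Hypothetical helper: removes one pair of enclosing double quotes, if present,
    // then turns each doubled quote ("") inside the field into a single one.
    public static string Unquote(string field)
    {
        bool quoted = field.Length >= 2
            && field[0] == '"'
            && field[field.Length - 1] == '"';
        if (!quoted)
        {
            return field; // unquoted fields are left untouched
        }
        string inner = field.Substring(1, field.Length - 2);
        return inner.Replace("\"\"", "\"");
    }
}
```

It would slot into the pipeline above as .Select(QuoteSketch.Unquote) in place of the nested Replace calls.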
Upvotes: 2
Reputation: 6043
Why are you using regular expressions for this? To check that the file is well-formed?
To remove the quotes, you can use String.Replace():
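String s = "\"Some text with, comma\"";
s = s.Replace("\"", ""); // removes every double quote in the field
// After the line has been split into fields with your regex
// (needs using System.Text.RegularExpressions; a plain line.Split(',')
// would also split on the commas inside quoted fields):
String line = "2012-01-06,\"Some text with, comma\",,\"300,00\",\"143,52\"";
String[] fields = Regex.Split(line, "[\t,](?=(?:[^\"]|\"[^\"]*\")*$)");
for (int i = 0; i < fields.Length; i++)
{
    // Call a function to remove quotes
    fields[i] = removeQuotes(fields[i]);
}
String removeQuotes(String field)
{
    return field.Replace("\"", "");
}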
Upvotes: 2