Reputation: 557
I have been trying to parse a CSV file with three fields. The first two fields are simple and easy to extract. The problem is the third field, which is a free-form string and can therefore contain special characters, including the ',' that delimits the fields. I tried enclosing the string field in double quotes ("), but my requirement is that a simple string (without special characters) may also appear without double quotes. I also need to handle line breaks inside the string. Below is a sample of the CSV file.
123,true,This is a memo
234,false,"This is also a memo"
345,true,
456,true,Above me is a blank memo
567,false,"This has a ,
in it"
678,true,This has a , in it <--- This record should be rejected
789,false,""
890,true,Above me is also a valid blank memo
I also found a good tool for testing regex patterns at http://derekslager.com/blog/posts/2007/09/a-better-dotnet-regular-expression-tester.ashx
So far I have used the following pattern: ^(""(?:[^""]|"""")""|[^,]),(""(?:[^""]|"""")""|[^,])$
The problem with this pattern is that it does not handle fields that span multiple lines, and it does not reject a string that has an opening double quote but no closing double quote.
Thanks in advance.
Thanks for the help, guys, but I needed to parse custom data in CSV and had to write my own custom parser. I parse each field separately and apply small regex patterns to one chunk at a time.
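For anyone with the same requirements, here is a minimal sketch of what such a field-by-field parser might look like. It is not the parser I actually wrote: it scans characters instead of using regex chunks, and the names MemoCsv, Record, Parse, ReadPlain, ReadQuoted and Expect are made up for illustration. It assumes every record is id,flag,memo, that a doubled quote ("") inside a quoted memo stands for a literal quote, and that the whole input is rejected with a FormatException as soon as one bad record is found.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class MemoCsv
{
    // Hypothetical record type for the three-field layout in the sample.
    public sealed class Record
    {
        public int Id;
        public bool Flag;
        public string Memo;
    }

    // Parses the whole text; throws FormatException on the first bad record.
    public static List<Record> Parse(string text)
    {
        var records = new List<Record>();
        int pos = 0;
        while (pos < text.Length)
        {
            string id = ReadPlain(text, ref pos);
            Expect(text, ref pos, ',');
            string flag = ReadPlain(text, ref pos);
            Expect(text, ref pos, ',');
            // The memo is quoted only if it starts with a double quote.
            string memo = (pos < text.Length && text[pos] == '"')
                ? ReadQuoted(text, ref pos)
                : ReadPlain(text, ref pos);
            // A record must end at a line break or at the end of the text,
            // so an unquoted memo containing ',' fails here.
            if (pos < text.Length && text[pos] == '\r') pos++;
            if (pos < text.Length) Expect(text, ref pos, '\n');
            records.Add(new Record { Id = int.Parse(id), Flag = bool.Parse(flag), Memo = memo });
        }
        return records;
    }

    // Reads an unquoted field: everything up to the next ',' or line break.
    private static string ReadPlain(string s, ref int pos)
    {
        int start = pos;
        while (pos < s.Length && s[pos] != ',' && s[pos] != '\r' && s[pos] != '\n') pos++;
        return s.Substring(start, pos - start);
    }

    // Reads a quoted field; commas and line breaks inside it are kept.
    private static string ReadQuoted(string s, ref int pos)
    {
        var sb = new StringBuilder();
        pos++; // skip the opening quote
        while (true)
        {
            if (pos >= s.Length)
                throw new FormatException("Missing closing double quote.");
            char c = s[pos];
            if (c == '"')
            {
                if (pos + 1 < s.Length && s[pos + 1] == '"') { sb.Append('"'); pos += 2; continue; }
                pos++; // skip the closing quote
                return sb.ToString();
            }
            sb.Append(c);
            pos++;
        }
    }

    // Consumes the expected character or rejects the input.
    private static void Expect(string s, ref int pos, char c)
    {
        if (pos >= s.Length || s[pos] != c)
            throw new FormatException("Unexpected character at offset " + pos + ".");
        pos++;
    }
}
```

With this sketch, the "678,true,This has a , in it" record fails at the end-of-record check, and a memo that opens with a double quote but never closes it fails inside ReadQuoted. An empty line in the middle of the file is also rejected, which may or may not be what you want.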
Upvotes: 0
Views: 964
Reputation: 58952
There is no need to reinvent the wheel. I recommend using an existing CSV parser; there are many good alternatives.
I have had great success with CsvReader. It is very fast and easy to use. Basic usage:
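// Namespace below is assumed to be the LumenWorks CsvReader library's; adjust it to the package you install.
using System;
using System.IO;
using LumenWorks.Framework.IO.Csv;

// The second constructor argument says that the first row holds the headers.
using (CsvReader csv = new CsvReader(new StreamReader("data.csv"), true))
{
    int fieldCount = csv.FieldCount;
    string[] headers = csv.GetFieldHeaders();
    while (csv.ReadNextRecord())
    {
        // Print every field of the current record as "header = value;".
        for (int i = 0; i < fieldCount; i++)
            Console.Write("{0} = {1};", headers[i], csv[i]);
        Console.WriteLine();
    }
}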
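If you would rather not add a third-party dependency, one of the alternatives is the TextFieldParser class in Microsoft.VisualBasic.FileIO. The following is only a sketch under some assumptions: the wrapper names BuiltInCsv and ReadRecords are made up, the project can reference the Microsoft.VisualBasic assembly, and "reject the record" is taken to mean throwing a FormatException when a row does not have exactly the expected number of fields.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class BuiltInCsv
{
    // Hypothetical helper: reads every record and rejects any row whose
    // field count differs from expectedFields.
    public static List<string[]> ReadRecords(TextReader reader, int expectedFields)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true; // quoted fields may contain ','
            parser.TrimWhiteSpace = false;           // keep the memo text as written
            while (!parser.EndOfData)
            {
                string[] fields = parser.ReadFields();
                // An unquoted ',' in the memo produces an extra field.
                if (fields.Length != expectedFields)
                    throw new FormatException(
                        "Expected " + expectedFields + " fields but found " + fields.Length + ".");
                rows.Add(fields);
            }
        }
        return rows;
    }
}
```

Calling BuiltInCsv.ReadRecords(new StreamReader("data.csv"), 3) should then accept the quoted and unquoted memos from the question and reject the row with the unquoted comma. Malformed quoting is reported by the parser itself through its own exception type, so check how it behaves on your data before relying on it.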
Upvotes: 4