John Threepwood
John Threepwood

Reputation: 16143

How to handle quotation marks within CSV files?

To read a CSV file, I use the following statement:

var query = from line in rawLines
    let data = line.Split(';')
    select new
    {
    col01 = data[0],
    col02 = data[1],
    col03 = data[2]
    };

The CSV file I want to read is malformed in the way, that an entry can have the separator ; itself as data when surrounded with qutation marks.

Example:

col01;col02;col03
data01;"data02;";data03

My read statement above does not work here, since it interprets the second row as four columns.

Question: Is there an easy way to handle this malformed CSV correctly? Perhaps with another LINQ query?

Upvotes: 3

Views: 3632

Answers (3)

Matt Griffiths
Matt Griffiths

Reputation: 1142

Parsing CSV files manually can always lead to issues like this. I would advise that you use a third party tool like CsvHelper to handle the parsing.

Furthermore, it's not a good idea to explicitly parse commas, as your separator can be overridden in your computers environment options.

Let me know if I can help further,

Matt

Upvotes: 1

Darin Dimitrov
Darin Dimitrov

Reputation: 1038830

Just use a CSV parser and STOP ROLLING YOUR OWN:

using (var parser = new TextFieldParser("test.csv"))
{
    parser.CommentTokens = new string[] { "#" };
    parser.SetDelimiters(new string[] { ";" });
    parser.HasFieldsEnclosedInQuotes = true;

    // Skip over header line.
    parser.ReadLine();

    while (!parser.EndOfData)
    {
        string[] fields = parser.ReadFields();
        Console.WriteLine("{0} {1} {2}", fields[0], fields[1], fields[2]);
    }
}

TextFieldParser is built in .NET. Just add reference to the Microsoft.VisualBasic assembly and you are good to go. A real CSV parser will happily handle this situation.

Upvotes: 12

Ignacio Soler Garcia
Ignacio Soler Garcia

Reputation: 21855

Not very elegant but after using your method you can check if any colxx contains an unfinished quotation mark (single) you can join it with the next colxx.

Upvotes: 0

Related Questions