nastassiar
nastassiar

Reputation: 1621

Parsing CSV File with \" in C#

I'm using VB's TextField in C# to parse a CSV file. But I am getting an error when it gets to \"

using (TextFieldParser csvReader = new TextFieldParser(csvFilePath)) {
    csvReader.SetDelimiters(new string[] { "," });
    csvReader.HasFieldsEnclosedInQuotes = true;
    string[] colFields = csvReader.ReadFields();
    foreach (string column in colFields)
    {
        DataColumn datacolumn = new DataColumn(column);
        datacolumn.AllowDBNull = true;
        csvData.Columns.Add(datacolumn);
    }
    while (!csvReader.EndOfData)
    {
        string[] fieldData = csvReader.ReadFields();
        for (int i = 0; i < fieldData.Length; i++)
        {
            if (fieldData[i] == "")
            {
                fieldData[i] = null;
            }
        }
        csvData.Rows.Add(fieldData);
    }
}

And this is the line in the csv that is causing the error:

"101","Brake System","Level should be between \"MIN\" and \"MAX\" marks."

I don't know how to deal with the \" in C# using TextFieldParser

Upvotes: 0

Views: 1718

Answers (3)

JNYRanger
JNYRanger

Reputation: 7097

Here's how to do it using two different methods without using TextFieldParser. TextFieldParser is very slow and not recommended for use in an actual production application.

Here's the simpler method using just String methods, and assuming that it's delimited with , without any quotes or any other special CSV formatting.

FileInfo file = new FileInfo("myfile.csv");
using (TextReader reader = file.OpenText())
{
    for(String line = reader.ReadLine(); line != null; line = reader.ReadLine())
    {
         string[] fields = line.Split(new[] {','});
         foreach(String f in fields)
         {
             //do whatever you need for each field
          }
     }
 }

Now if you want to use CsvHelper (available on nuget) becaues you have a more complicated CSV file with things like quoted field, headers, or if the rows of your CSV can map directly to an object that you have then this library might help you.

Not Mapped Example:

FileInfo file = new FileInfo("myfile.csv");
using (TextReader reader = file.OpenText())
using (CsvReader csv = new CsvReader(reader))
{
     csv.Configuration.Delimiter = ",";
     csv.Configuration.HasHeaderRecord = false;
     csv.Configuration.IgnoreQuotes = true; //if you don't use field quoting
     csv.Configuration.TrimFields = true; //trim fields as you read them
     csv.Configuration.WillThrowOnMissingField = false; //otherwise null fields aren't allowed

     while(csv.Read())
     {
          myStringVar = csv.GetField<string>(0); //gets first field as string
          myIntVar = csv.GetField<int>(1); //gets second field as int
          ... //etc, you get the idea

     }
}

Mapped Example:

Mapping Class- Assumes you have a class named MyClass with the fields named field1, field2, field3

public sealed class MyClassMap : CsvClassMap<MyClass>
{
     public MyClassMap()
     {
         Map(m => m.field1).Index(0);
         Map(m => m.field2).Index(1);
         Map(m => m.field3).Index(2);
     }
 }

Parsing Code

FileInfo file = new FileInfo("myfile.csv");
using (TextReader reader = file.OpenText())
using (CsvReader csv = new CsvReader(reader))
{
     csv.Configuration.Delimiter = ",";
     csv.Configuration.HasHeaderRecord = false;
     csv.Configuration.IgnoreQuotes = true; //if you don't use field quoting
     csv.Configuration.TrimFields = true; //trim fields as you read them
     csv.Configuration.WillThrowOnMissingField = false; //otherwise null fields aren't allowed
     csv.Configuration.RegisterClassMap<MyClassMap>(); //adds our mapping class to the reader

     while(csv.Read())
     {
         myObject = csv.GetRecord<MyClass>();
         //do whatever here
     }
}

Both of these methods won't care that you have any strange characters like \ in your csv file.

Disclaimer: I have no relation to CsvHelper, but have had success with it in a few projects in the past in which it has made my life much easier

Upvotes: 0

Cory Nelson
Cory Nelson

Reputation: 29991

If you don't mind using a different library, Ctl.Data has a mode (parseMidQuotes: true) specifically to allow parsing broken CSV like this.

using (StreamReader sr = new StreamReader("data.csv"))
{
    var reader = new CsvReader<Record>(sr, parseMidQuotes: true, readHeader: false);

    while (reader.Read())
    {
        Record rec = reader.CurrentObject.Value;
        rec.Description = rec.Description.Replace("\\\"", "\"");

        // use record...
    }
}

And define your Record object:

(Normally it would match the header of the file to the properties, but in your case with a headerless file you need to specify the order with the Column attribute.

class Record
{
    [Column(Order = 0)]
    public int Id { get; set; }

    [Column(Order = 1)]
    public string Category { get; set; }

    [Column(Order = 2)]
    public string Description { get; set; }
}

(Disclaimer: I'm the author of said library)

Upvotes: 0

Andrew Morton
Andrew Morton

Reputation: 25023

If the csv file will fit into memory, you could read it in, replace each \" with "", and use a MemoryStream as the input to the the TextFieldParser:

string data = File.ReadAllText(@"C:\temp\csvdata.txt").Replace("\\\"", "\"\"");

//TODO: Use the correct Encoding.
using (MemoryStream ms = new MemoryStream(Encoding.UTF8.GetBytes(data)))
{
    using (TextFieldParser csvReader = new TextFieldParser(ms))
    {
        csvReader.SetDelimiters(new string[] { "," });
        csvReader.HasFieldsEnclosedInQuotes = true;
        string[] colFields = csvReader.ReadFields();
        foreach (string s in colFields)
        {
            Console.WriteLine(s);
        }
    }
}

Which, for your example data, outputs

101
Brake System
Level should be between "MIN" and "MAX" marks.

Upvotes: 2

Related Questions