Reputation: 1621
I'm using VB's TextField in C# to parse a CSV file. But I am getting an error when it gets to \"
using (TextFieldParser csvReader = new TextFieldParser(csvFilePath)) {
csvReader.SetDelimiters(new string[] { "," });
csvReader.HasFieldsEnclosedInQuotes = true;
string[] colFields = csvReader.ReadFields();
foreach (string column in colFields)
{
DataColumn datacolumn = new DataColumn(column);
datacolumn.AllowDBNull = true;
csvData.Columns.Add(datacolumn);
}
while (!csvReader.EndOfData)
{
string[] fieldData = csvReader.ReadFields();
for (int i = 0; i < fieldData.Length; i++)
{
if (fieldData[i] == "")
{
fieldData[i] = null;
}
}
csvData.Rows.Add(fieldData);
}
}
And this is the line in the csv that is causing the error:
"101","Brake System","Level should be between \"MIN\" and \"MAX\" marks."
I don't know how to deal with the \" in C# using TextFieldParser
Upvotes: 0
Views: 1718
Reputation: 7097
Here's how to do it using two different methods without using TextFieldParser
. TextFieldParser
is very slow and not recommended for use in an actual production application.
Here's the simpler method using just String
methods, and assuming that it's delimited with ,
without any quotes or any other special CSV formatting.
FileInfo file = new FileInfo("myfile.csv");
using (TextReader reader = file.OpenText())
{
for(String line = reader.ReadLine(); line != null; line = reader.ReadLine())
{
string[] fields = line.Split(new[] {','});
foreach(String f in fields)
{
//do whatever you need for each field
}
}
}
Now if you want to use CsvHelper (available on nuget) becaues you have a more complicated CSV file with things like quoted field, headers, or if the rows of your CSV can map directly to an object that you have then this library might help you.
Not Mapped Example:
FileInfo file = new FileInfo("myfile.csv");
using (TextReader reader = file.OpenText())
using (CsvReader csv = new CsvReader(reader))
{
csv.Configuration.Delimiter = ",";
csv.Configuration.HasHeaderRecord = false;
csv.Configuration.IgnoreQuotes = true; //if you don't use field quoting
csv.Configuration.TrimFields = true; //trim fields as you read them
csv.Configuration.WillThrowOnMissingField = false; //otherwise null fields aren't allowed
while(csv.Read())
{
myStringVar = csv.GetField<string>(0); //gets first field as string
myIntVar = csv.GetField<int>(1); //gets second field as int
... //etc, you get the idea
}
}
Mapped Example:
Mapping Class- Assumes you have a class named MyClass
with the fields named field1, field2, field3
public sealed class MyClassMap : CsvClassMap<MyClass>
{
public MyClassMap()
{
Map(m => m.field1).Index(0);
Map(m => m.field2).Index(1);
Map(m => m.field3).Index(2);
}
}
Parsing Code
FileInfo file = new FileInfo("myfile.csv");
using (TextReader reader = file.OpenText())
using (CsvReader csv = new CsvReader(reader))
{
csv.Configuration.Delimiter = ",";
csv.Configuration.HasHeaderRecord = false;
csv.Configuration.IgnoreQuotes = true; //if you don't use field quoting
csv.Configuration.TrimFields = true; //trim fields as you read them
csv.Configuration.WillThrowOnMissingField = false; //otherwise null fields aren't allowed
csv.Configuration.RegisterClassMap<MyClassMap>(); //adds our mapping class to the reader
while(csv.Read())
{
myObject = csv.GetRecord<MyClass>();
//do whatever here
}
}
Both of these methods won't care that you have any strange characters like \
in your csv file.
Disclaimer: I have no relation to CsvHelper, but have had success with it in a few projects in the past in which it has made my life much easier
Upvotes: 0
Reputation: 29991
If you don't mind using a different library, Ctl.Data has a mode (parseMidQuotes: true
) specifically to allow parsing broken CSV like this.
using (StreamReader sr = new StreamReader("data.csv"))
{
var reader = new CsvReader<Record>(sr, parseMidQuotes: true, readHeader: false);
while (reader.Read())
{
Record rec = reader.CurrentObject.Value;
rec.Description = rec.Description.Replace("\\\"", "\"");
// use record...
}
}
And define your Record
object:
(Normally it would match the header of the file to the properties, but in your case with a headerless file you need to specify the order with the Column
attribute.
class Record
{
[Column(Order = 0)]
public int Id { get; set; }
[Column(Order = 1)]
public string Category { get; set; }
[Column(Order = 2)]
public string Description { get; set; }
}
(Disclaimer: I'm the author of said library)
Upvotes: 0
Reputation: 25023
If the csv file will fit into memory, you could read it in, replace each \"
with ""
, and use a MemoryStream as the input to the the TextFieldParser:
string data = File.ReadAllText(@"C:\temp\csvdata.txt").Replace("\\\"", "\"\"");
//TODO: Use the correct Encoding.
using (MemoryStream ms = new MemoryStream(Encoding.UTF8.GetBytes(data)))
{
using (TextFieldParser csvReader = new TextFieldParser(ms))
{
csvReader.SetDelimiters(new string[] { "," });
csvReader.HasFieldsEnclosedInQuotes = true;
string[] colFields = csvReader.ReadFields();
foreach (string s in colFields)
{
Console.WriteLine(s);
}
}
}
Which, for your example data, outputs
101
Brake System
Level should be between "MIN" and "MAX" marks.
Upvotes: 2