Reputation: 233
I am trying to parse a CSV file with commas in the data. All fields do not have quotes around them, but SOME of the dollar amounts do when there is a comma in them. So for example the file looks something like this:
0000-0C,Sales-Data Entry,10/31/2016,000000,VJ,Comapny,,312.3
0000-0C,Sales-Data Entry,10/31/2016,000000,VJ,Company,,"1,420.97"
So as you can see there is a comma in the 1,420.97 but it does have quotes around this. Is there a way to reliably read this file?
Here is my code:
var path = @"glid.csv";
TextFieldParser parser = new TextFieldParser(path);
parser.HasFieldsEnclosedInQuotes = true;
parser.SetDelimiters(",");
while (!parser.EndOfData)
{
parser.ReadLine();
fields = parser.ReadFields();
//do something
}
parser.Close();
Upvotes: 0
Views: 7515
Reputation: 21864
CSV is a rebellious bird, that none can tame and therefore NEVER try to parse it with own rules.
I strongly recommend to use the OleDbDataAdapter
to fill a Dataset instead of the Microsoft.VisualBasic.FileIO.TextFieldParser
, this saves you a lot of time and headaches.
string fileName = @"glid.csv";
DataSet ds = new DataSet("csvData");
string dir = Path.GetDirectoryName(fileName);
string connstr = String.Format("Provider = Microsoft.Jet.OleDb.4.0; Data Source={0}; Extended Properties = \"Text;HDR=YES;FMT=Delimited\"",dir);
using (System.Data.OleDb.OleDbConnection conn = new System.Data.OleDb.OleDbConnection())
{
conn.Open();
OleDbDataAdapter adapter = new OleDbDataAdapter("SELECT * FROM " + Path.GetFileName(fileName), conn);
adapter.Fill(ds);
}
var p = ds; //<-- here is your data;
Upvotes: 3
Reputation: 4846
In all honesty, I'd reccommend using this nuget package. It's really helpful in both reading and writing csv and even allows you to directly map onto a class creating an object model from your records. It will handle all of the escaping and complicated stuff like that for you. Put simply, it just works.
It's really good stuff and very easy to use!
Upvotes: 0
Reputation: 514
Are you sure you want to use both
parser.ReadLine();
and
fields = parser.ReadFields();
As per the documentation https://msdn.microsoft.com/en-us/library/microsoft.visualbasic.fileio.textfieldparser(v=vs.110).aspx this will skip a line, process the next line over and over until the end of the file.
You may also wish to check your encoding type, TextFieldParser constructor accepts an Encoding type: https://msdn.microsoft.com/en-us/library/ms128085(v=vs.110).aspx
try the following encodings: https://msdn.microsoft.com/en-us/library/system.text.encoding(v=vs.110).aspx
Upvotes: 3