Mehdi Hadjar

Reputation: 675

Import data from CSV file with comma in string cells

I want to import data from a CSV file, but some cells contain commas inside their string values. How can I tell which commas are separators and which are part of the cell content?

Upvotes: 1

Views: 5861

Answers (5)

Husen

Reputation: 1587

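You can also use a regex, as below.

using System.Text.RegularExpressions;
// Split only on commas outside double quotes; Test holds one line of the CSV file.
Regex CSVParser = new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");
String[] Fields = CSVParser.Split(Test);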
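
To make the behaviour of that pattern concrete, here is a minimal sketch that wraps it in a helper and strips the surrounding quotes afterwards. The class and method names (CsvRegexDemo, SplitLine) are made up for illustration, and the sketch assumes each record sits on a single line and that quotes inside a field are not escaped as "".

```csharp
using System;
using System.Text.RegularExpressions;

public static class CsvRegexDemo
{
    // Same pattern as above: match a comma only when the rest of the line
    // contains an even number of double quotes, i.e. the comma is outside quotes.
    private static readonly Regex CsvSplitter =
        new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");

    // Hypothetical helper: splits one CSV line and removes the enclosing quotes.
    public static string[] SplitLine(string line)
    {
        string[] fields = CsvSplitter.Split(line);
        for (int i = 0; i < fields.Length; i++)
        {
            fields[i] = fields[i].Trim('"');
        }
        return fields;
    }
}
```

For example, CsvRegexDemo.SplitLine("1,\"Smith, John\",NY") gives three fields: 1, Smith, John and NY. Without the Trim call the second field would still carry its quotes, because Split itself leaves them in place.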

Upvotes: 0

ATHER

Reputation: 3384

Using FileHelpers is definitely the way to go. They have done a great job building all the logic for you. I had the same issue, where I had to parse a CSV file with commas inside the fields, and this utility did the job very well. All you have to do is put the following attribute on the field:

    [FieldQuoted('"', QuoteMode.OptionalForBoth)]

For details, see http://www.filehelpers.com/forums/viewtopic.php?f=12&t=391
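
For context, here is a rough sketch of where that attribute goes. The record layout (Id, Name, City) and the names CustomerRecord and FileHelpersDemo are assumptions for illustration only; adapt the fields to the columns in your own file. It needs the FileHelpers library referenced in the project.

```csharp
using FileHelpers;

// Hypothetical record type: one public field per CSV column, in file order.
[DelimitedRecord(",")]
public class CustomerRecord
{
    public int Id;

    // This column may or may not be quoted, and may contain commas when it is.
    [FieldQuoted('"', QuoteMode.OptionalForBoth)]
    public string Name;

    public string City;
}

public static class FileHelpersDemo
{
    // Reads the whole file into an array of typed records.
    public static CustomerRecord[] Load(string path)
    {
        var engine = new FileHelperEngine<CustomerRecord>();
        return engine.ReadFile(path);
    }
}
```

With a line such as 1,"Smith, John",NY the Name field should come back as Smith, John, without the quotes.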

Upvotes: 0

Pleun

Reputation: 8920

In general, do not bother writing the import yourself.

I have had good experiences with the FileHelpers library: http://www.filehelpers.com/

And indeed, I hope your fields are quoted. FileHelpers supports this out of the box.

Otherwise there is not much you can do.

Upvotes: 1

BLUEPIXY

Reputation: 40145

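Use TextFieldParser. Usage:

using Microsoft.VisualBasic.FileIO; // requires a reference to Microsoft.VisualBasic.dll
...
// reader is a TextReader (or a Stream, or a file path) for the CSV data
using (var csvReader = new TextFieldParser(reader)) {
    csvReader.SetDelimiters(new string[] { "," });
    csvReader.HasFieldsEnclosedInQuotes = true;
    // Each call returns the fields of the next row, with the enclosing quotes removed
    fields = csvReader.ReadFields();
}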
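
The snippet above reads a single row. A minimal sketch of reading every row could look like this; the class and method names (CsvReaderDemo, ReadRows) are invented for the example, and it assumes the caller supplies the TextReader.

```csharp
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class CsvReaderDemo
{
    // Hypothetical helper: returns every row of the CSV data as a string array.
    public static List<string[]> ReadRows(TextReader reader)
    {
        var rows = new List<string[]>();
        using (var csvReader = new TextFieldParser(reader))
        {
            csvReader.SetDelimiters(new string[] { "," });
            csvReader.HasFieldsEnclosedInQuotes = true;

            // ReadFields advances one row per call; EndOfData signals the end.
            while (!csvReader.EndOfData)
            {
                rows.Add(csvReader.ReadFields());
            }
        }
        return rows;
    }
}
```

Usage, with an in-memory string standing in for the file:

```csharp
var rows = CsvReaderDemo.ReadRows(new StringReader("1,\"Smith, John\",NY\n2,Doe,LA"));
// rows[0][1] is "Smith, John" (quotes removed, comma kept)
```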

Upvotes: 6

FlyingGuy

Reputation: 333

Unless you have quotes around the strings you are pretty much hosed, which is the reason for the "quote and comma" delimiter style. If you have control of the export facility, then select "enclose strings in quotes" or change the delimiter to something like a tilde or caret symbol.

If not, then you have to write some code. If you detect a letter (a..z), start counting commas and keep working through the string until you detect a digit [0..9]. Even then this is going to be problematic, since people can put digits in their text. At best this is going to be a best-effort process. You're going to have to know when you are inside text and when you are not. I doubt even regex will help you much on this.

The only other thing I can think of is to run through the data and look for commas, then look at the characters just before and after each comma. If the comma is surrounded by letters, replace it with an alternate character like the caret "^" or the tilde "~". Then process the file as normal, and afterwards replace the alternate character with a comma again.
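
As a rough sketch of that last idea, assuming "surrounded by chars" means a letter directly on each side of the comma, and assuming the tilde never appears in the real data (the names CommaPlaceholderDemo and SplitLine are made up):

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class CommaPlaceholderDemo
{
    // Assumption: this character does not occur anywhere in the data.
    private const char Placeholder = '~';

    // A comma with a letter immediately before and immediately after it.
    private static readonly Regex CommaBetweenLetters =
        new Regex("(?<=[A-Za-z]),(?=[A-Za-z])");

    public static string[] SplitLine(string line)
    {
        // 1. Hide the commas that look like part of the text.
        string masked = CommaBetweenLetters.Replace(line, Placeholder.ToString());

        // 2. Split as normal, then 3. put the hidden commas back.
        return masked.Split(',')
                     .Select(field => field.Replace(Placeholder, ','))
                     .ToArray();
    }
}
```

With "42,salt,pepper,7" this yields 42, salt,pepper and 7. It is strictly best effort: two neighbouring text columns get merged the same way, and a comma followed by a space (as in Smith, John) is still treated as a separator.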

Good luck.

Upvotes: 0
