dark.vador

Reputation: 687

Parsing commas in field with TextFieldParser causes extra column

Following up on a previous thread, "Object reference on array with regex.replace()":

I have a few columns in cpath that contain fields such as street,""test,format"", casio. Unfortunately,

csvReader.SetDelimiters(new string[] { "," });

adds extra empty columns to the DataTable csvData because of the comma between test and format. Hence the following question:

Is there a way to remove the comma from """test,format""" so that csvData contains street,""test format"", casio?

Thanks in advance

EDIT

// Requires: using Microsoft.VisualBasic.FileIO; using System.Text.RegularExpressions;
private static List<string[]> RemoveComaDataFromCSVFile(string csv_file_path)
{
    List<string[]> allLineFields = new List<string[]>();

    try
    {
        // TextFieldParser opens the file itself, so no separate reader is needed
        using (TextFieldParser csvReader = new TextFieldParser(csv_file_path))
        {
            csvReader.Delimiters = new[] { "," };
            csvReader.HasFieldsEnclosedInQuotes = true;
            while (!csvReader.EndOfData)
            {
                string[] fieldData = csvReader.ReadFields();
                string pattern = "\"";
                string replacement = "";
                for (int i = 0; i < fieldData.Length; i++)
                {
                    // Remove any remaining quotation marks from the field
                    fieldData[i] = Regex.Replace(fieldData[i], pattern, replacement);
                    // Store empty values as null
                    if (fieldData[i] == "")
                    {
                        fieldData[i] = null;
                    }
                }
                //csvData.Rows.Add(fieldData);
                allLineFields.Add(fieldData);
            }
        }
    }
    catch (Exception ex)
    {
        // Report the failure instead of silently swallowing it
        Console.Error.WriteLine("Failed to parse " + csv_file_path + ": " + ex.Message);
    }
    return allLineFields;
}

Upvotes: 1

Views: 1704

Answers (1)

Tim Schmelter

Reputation: 460148

TextFieldParser has a property HasFieldsEnclosedInQuotes which you should set to true. Then the comma in "test,format" doesn't matter because it will be interpreted as a single value.

Example:

List<string[]> allLineFields = new List<string[]>();
string sampleLine = @"street,""test,format"", casio";
using (TextReader sr = new StringReader(sampleLine))
using (TextFieldParser parser = new TextFieldParser(sr))
{
    parser.HasFieldsEnclosedInQuotes = true;
    parser.Delimiters = new[] { "," };
    while (!parser.EndOfData)
    {
        string[] fields = parser.ReadFields();
        allLineFields.Add(fields);
    }
}

Result is a single string[] with three fields:

    [0] {string[3]}         string[]
        [0] "street"        string
        [1] "test,format"   string
        [2] "casio"         string
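
The question also asked for the comma inside the field to be dropped, so that the value reads test format rather than test,format. One possible way, sketched here on the assumption that replacing every embedded comma with a space is acceptable, is to post-process each field after ReadFields. The class name CsvCommaDemo and the helper ParseAndStripCommas are hypothetical names used only for illustration. They are not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class CsvCommaDemo
{
    // Hypothetical helper: parses CSV text with quoted fields and
    // replaces any comma left inside a field value with a space.
    public static List<string[]> ParseAndStripCommas(string csvText)
    {
        var rows = new List<string[]>();
        using (var reader = new StringReader(csvText))
        using (var parser = new TextFieldParser(reader))
        {
            parser.HasFieldsEnclosedInQuotes = true;
            parser.Delimiters = new[] { "," };
            while (!parser.EndOfData)
            {
                string[] fields = parser.ReadFields();
                for (int i = 0; i < fields.Length; i++)
                {
                    // Assumption: a single space is the desired substitute
                    fields[i] = fields[i].Replace(",", " ");
                }
                rows.Add(fields);
            }
        }
        return rows;
    }

    public static void Main()
    {
        var rows = ParseAndStripCommas("street,\"test,format\", casio");
        // prints: street|test format|casio
        Console.WriteLine(string.Join("|", rows[0]));
    }
}
```

Because the parser has already split the line on the real delimiters, the Replace call only touches commas that were protected by quotes, so the column count stays at three.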

Upvotes: 2
