MHeads
MHeads

Reputation: 387

Extra quotes when converting .csv to datatable

I need to fill two datatables with two excel files. The files can be a Comma-separated values file (.csv) or it can be an Excel document (.xlsx) file. I use a function to convert my .csv to a datatable :


public DataTable GetDataTableFromCsv(string path)
{
    DataTable dataTable = new DataTable();
    String[] values;

    values = File.ReadAllLines(path);

    string[] csvRows = System.IO.File.ReadAllLines(path);
    string[] headers = csvRows[0].Split(',');

    // Adding columns name
    foreach (var item in headers)
        dataTable.Columns.Add(new DataColumn(item));

    string[] fields = null;

    foreach (string csvRow in csvRows)
    {
        fields = csvRow.Split(',');
        DataRow row = dataTable.NewRow();
        row.ItemArray = fields;
        dataTable.Rows.Add(row);
    }
    return dataTable;
}

The problem occurs when I try to compare my two datatables with :

ieDiff = dt1.AsEnumerable().Except(dt2.AsEnumerable(), DataRowComparer.Default);

My first datatable obtained by a .xlsx file contains the correct values ​​and my second datatable obtained by a .csv file contains the correct values​​, but with some extra quotes.

ex :

dt1.Rows[10] (datatable from .csv) will contains :

  1. Item1
  2. Item2
  3. "."
  4. "Item4"

dt2.Rows[10] (datatable from .xlsx) will contains :

  1. Item1
  2. Item2
  3. .
  4. Item4

I know I can trim all of these, but there is certainly a better way to perform this?

Upvotes: 2

Views: 2877

Answers (2)

WonderWorker
WonderWorker

Reputation: 9062

I have written five methods below that will turn a Csv file into a DataTable.

They have been designed to take into account optional quote marks (e.g. " symbols) and to be as versatile as possible without using other libraries:

    public static DataTable GetDataTabletFromCSVFile(string filePath, bool isHeadings)
    {
        DataTable MethodResult = null;
        try
        {
            using (TextFieldParser TextFieldParser = new TextFieldParser(filePath))
            {
                if (isHeadings)
                {
                    MethodResult = GetDataTableFromTextFieldParser(TextFieldParser);

                }
                else
                {
                    MethodResult = GetDataTableFromTextFieldParserNoHeadings(TextFieldParser);

                }

            }

        }
        catch (Exception ex)
        {
            ex.HandleException();
        }
        return MethodResult;
    }

    public static DataTable GetDataTableFromCsvString(string csvBody, bool isHeadings)
    {
        DataTable MethodResult = null;
        try
        {
            MemoryStream MemoryStream = new MemoryStream();


            StreamWriter StreamWriter = new StreamWriter(MemoryStream);

            StreamWriter.Write(csvBody);

            StreamWriter.Flush();


            MemoryStream.Position = 0;


            using (TextFieldParser TextFieldParser = new TextFieldParser(MemoryStream))
            {
                if (isHeadings)
                {
                    MethodResult = GetDataTableFromTextFieldParser(TextFieldParser);

                }
                else
                {
                    MethodResult = GetDataTableFromTextFieldParserNoHeadings(TextFieldParser);

                }

            }

        }
        catch (Exception ex)
        {
            ex.HandleException();
        }
        return MethodResult;
    }

    public static DataTable GetDataTableFromRemoteCsv(string url, bool isHeadings)
    {
        DataTable MethodResult = null;
        try
        {
            HttpWebRequest httpWebRequest = (HttpWebRequest)WebRequest.Create(url);
            HttpWebResponse httpWebResponse = (HttpWebResponse)httpWebRequest.GetResponse();

            StreamReader StreamReader = new StreamReader(httpWebResponse.GetResponseStream());

            using (TextFieldParser TextFieldParser = new TextFieldParser(StreamReader))
            {
                if (isHeadings)
                {
                    MethodResult = GetDataTableFromTextFieldParser(TextFieldParser);

                }
                else
                {
                    MethodResult = GetDataTableFromTextFieldParserNoHeadings(TextFieldParser);

                }

            }

        }
        catch (Exception ex)
        {
            ex.HandleException();
        }
        return MethodResult;
    }


    private static DataTable GetDataTableFromTextFieldParser(TextFieldParser textFieldParser)
    {
        DataTable MethodResult = null;
        try
        {
            textFieldParser.SetDelimiters(new string[] { "," });

            textFieldParser.HasFieldsEnclosedInQuotes = true;


            string[] ColumnFields = textFieldParser.ReadFields();

            DataTable dt = new DataTable();

            foreach (string ColumnField in ColumnFields)
            {
                DataColumn DataColumn = new DataColumn(ColumnField);

                DataColumn.AllowDBNull = true;

                dt.Columns.Add(DataColumn);

            }


            while (!textFieldParser.EndOfData)
            {
                string[] Fields = textFieldParser.ReadFields();


                for (int i = 0; i < Fields.Length; i++)
                {
                    if (Fields[i] == "")
                    {
                        Fields[i] = null;

                    }

                }

                dt.Rows.Add(Fields);

            }

            MethodResult = dt;

        }
        catch (Exception ex)
        {
            ex.HandleException();
        }
        return MethodResult;
    }

    private static DataTable GetDataTableFromTextFieldParserNoHeadings(TextFieldParser textFieldParser)
    {
        DataTable MethodResult = null;
        try
        {
            textFieldParser.SetDelimiters(new string[] { "," });

            textFieldParser.HasFieldsEnclosedInQuotes = true;

            bool FirstPass = true;

            DataTable dt = new DataTable();

            while (!textFieldParser.EndOfData)
            {
                string[] Fields = textFieldParser.ReadFields();

                if(FirstPass)
                {
                    for (int i = 0; i < Fields.Length; i++)
                    {
                        DataColumn DataColumn = new DataColumn("Column " + i);

                        DataColumn.AllowDBNull = true;

                        dt.Columns.Add(DataColumn);

                    }

                    FirstPass = false;

                }

                for (int i = 0; i < Fields.Length; i++)
                {
                    if (Fields[i] == "")
                    {
                        Fields[i] = null;

                    }

                }

                dt.Rows.Add(Fields);

            }

            MethodResult = dt;

        }
        catch (Exception ex)
        {
            ex.HandleException();
        }
        return MethodResult;
    }

If, like me, you're saving from reporting services then you should use it like this:

    Warning[] warnings;
    string[] streamids;
    string mimeType;
    string encoding;
    string filenameExtension;

    byte[] bytes = rvMain.ServerReport.Render("csv", null, out mimeType, out encoding, out filenameExtension, out streamids, out warnings);

    string CsvBody = System.Text.Encoding.UTF8.GetString(bytes);

    DataTable dt = GetDataTableFromCsvString(CsvBody,true);

Otherwise, all you need do is:

    bool IsHeadings = true; //Does the data include a heading row?

    DataTable dt = GetDataTableFromCsvString(CsvBody, IsHeadings);

Or to use directly from a csv file

    bool IsHeadings = true; //Does the data include a heading row?

    DataTable dt = GetDataTabletFromCsvFile(FilePath, IsHeadings)

Or to use a csv file that is stored remotely

    bool IsHeadings = true; //Does the data include a heading row?

    DataTable dt = GetDataTabletFromRemoteCsv(Url, IsHeadings)

Upvotes: 0

Charles380
Charles380

Reputation: 1279

The easiest answer is one you mentioned in the question which is to trim the quote character.

My test code:

var test = new[] { "Item1", "Item2", "\".\"", "\"Item4\"" };
foreach (var s in test)
{
    textBox1.AppendTextAddNewLine(s.Trim('"'));
}

Resulting Output:

Item1
Item2
.
Item4

Upvotes: 1

Related Questions