MAC
MAC

Reputation: 6577

Splitting Comma Separated Values (CSV)

How to split the CSV file in c sharp? And how to display this?

Upvotes: 2

Views: 7868

Answers (8)

Michael Hutter
Michael Hutter

Reputation: 1542

Problem: Convert a comma separated string into an array where commas in "quoted strings,,," should not be considered as separators but as part of an entry

Input: String: First,"Second","Even,With,Commas",,Normal,"Sentence,with ""different"" problems",3,4,5

Output: String-Array: ['First','Second','Even,With,Commas','','Normal','Sentence,with "different" problems','3','4','5']

Code:

string sLine;
sLine = "First,\"Second\",\"Even,With,Commas\",,Normal,\"Sentence,with \"\"different\"\" problems\",3,4,5";

// 1. Split line by separator; do not split if separator is within quotes
string Separator = ",";
string Escape = '"'.ToString();
MatchCollection Matches = Regex.Matches(sLine,
    string.Format("({1}[^{1}]*{1})*(?<Separator>{0})({1}[^{1}]*{1})*", Separator, Escape));
string[] asColumns = new string[Matches.Count + 1];

int PriorMatchIndex = 0;
for (int Index = 0; Index <= asColumns.Length - 2; Index++)
{
    asColumns[Index] = sLine.Substring(PriorMatchIndex, Matches[Index].Groups["Separator"].Index - PriorMatchIndex);
    PriorMatchIndex = Matches[Index].Groups["Separator"].Index + Separator.Length;
}
asColumns[asColumns.Length - 1] = sLine.Substring(PriorMatchIndex);

// 2. Remove quotes
for (int Index = 0; Index <= asColumns.Length - 1; Index++)
{
    if (Regex.IsMatch(asColumns[Index], string.Format("^{0}[^{0}].*[^{0}]{0}$", Escape))) // If "Text" is sourrounded by quotes (but ignore double quotes => "Leave ""inside"" quotes")
    {
        asColumns[Index] = asColumns[Index].Substring(1, asColumns[Index].Length - 2); // "Text" => Text
    }
    asColumns[Index] = asColumns[Index].Replace(Escape + Escape, Escape); // Remove double quotes ('My ""special"" text' => 'My "special" text')
    if (asColumns[Index] == null) asColumns[Index] = "";
}

The output array is asColumns

Upvotes: 0

MAC
MAC

Reputation: 6577

I had got the result for my query. its like simple like i had read a file using io.file. and all the text are stored into a string. After that i splitted with a seperator. The code is shown below.

using System;
using System.Collections.Generic;
using System.Text;

namespace CSV
{
    class Program
    {
        static void Main(string[] args)
        {

            string csv = "user1, user2, user3,user4,user5";

            string[] split = csv.Split(new char[] {',',' '});
            foreach(string s in split)
            {
                if (s.Trim() != "")
                    Console.WriteLine(s);
            }
            Console.ReadLine();
        }
    }
}

Upvotes: 0

cjk
cjk

Reputation: 46485

Import Micorosoft.VisualBasic as a reference (I know, its not that bad) and use Microsoft.VisualBasic.FileIO.TextFieldParser - this handles CSV files very well, and can be used in any .Net language.

Upvotes: 3

Mark
Mark

Reputation: 11

The following function takes a line from a CSV file and splits it into a List<string>.

Arguments:
string line = the line to split
string textQualifier = what (if any) text qualifier (i.e. "" or "\"" or "'")
char delim = the field delimiter (i.e. ',' or ';' or '|' or '\t')
int colCount = the expected number of fields (0 means don't check)

Example usage:

List<string> fields = SplitLine(line, "\"", ',', 5);
// or
List<string> fields = SplitLine(line, "'", '|', 10);
// or
List<string> fields = SplitLine(line, "", '\t', 0);

Function:

private List<string> SplitLine(string line, string textQualifier, char delim, int colCount)
{
    List<string> fields = new List<string>();
    string origLine = line;

    char textQual = '"';
    bool hasTextQual = false;
    if (!String.IsNullOrEmpty(textQualifier))
    {
        hasTextQual = true;
        textQual = textQualifier[0];            
    }

    if (hasTextQual)
    {
        while (!String.IsNullOrEmpty(line))
        {
            if (line[0] == textQual) // field is text qualified so look for next unqualified delimiter
            {
                int fieldLen = 1;
                while (true)
                {
                    if (line.Length == 2) // must be final field (zero length)
                    {
                        fieldLen = 2;
                        break;
                    }
                    else if (fieldLen + 1 >= line.Length) // must be final field
                    {
                        fieldLen += 1;
                        break;
                    }
                    else if (line[fieldLen] == textQual && line[fieldLen + 1] == textQual) // escaped text qualifier
                    {
                        fieldLen += 2;
                    }
                    else if (line[fieldLen] == textQual && line[fieldLen + 1] == delim) // must be end of field
                    {
                        fieldLen += 1;
                        break;
                    }
                    else // not a delimiter
                    {
                        fieldLen += 1;
                    }
                }
                string escapedQual = textQual.ToString() + textQual.ToString();
                fields.Add(line.Substring(1, fieldLen - 2).Replace(escapedQual, textQual.ToString())); // replace escaped qualifiers
                if (line.Length >= fieldLen + 1)
                {
                    line = line.Substring(fieldLen + 1);
                    if (line == "") // blank final field
                    {
                        fields.Add("");
                    }
                }
                else
                {
                    line = "";
                }
            }
            else // field is not text qualified
            {
                int fieldLen = line.IndexOf(delim);
                if (fieldLen != -1) // check next delimiter position
                {
                    fields.Add(line.Substring(0, fieldLen));
                    line = line.Substring(fieldLen + 1);
                    if (line == "") // final field must be blank 
                    {
                        fields.Add("");
                    }
                }
                else // must be last field
                {
                    fields.Add(line);
                    line = "";
                }
            }
        }
    }
    else // if there is no text qualifier, then use existing split function
    {
        fields.AddRange(line.Split(delim));
    }      

    if (colCount > 0 && colCount != fields.Count) // count doesn't match expected so throw exception
    {
        throw new Exception("Field count was:" + fields.Count.ToString() + ", expected:" + colCount.ToString() + ". Line:" + origLine);

    }
    return fields;
}

Upvotes: 0

Graham
Graham

Reputation: 15244

I've been using the TextFieldParser Class in the Microsoft.VisualBasic.FileIO namespace for a C# project I'm working on. It will handle complications such as embedded commas or fields that are enclosed in quotes etc. It returns a string[] and, in addition to CSV files, can also be used for parsing just about any type of structured text file.

Upvotes: 12

Spencer Ruport
Spencer Ruport

Reputation: 35117

This is a CSV parser I use on occasion.

Usage: (dgvMyView is a datagrid type.)

CSVReader reader = new CSVReader("C:\MyFile.txt");
reader.DisplayResults(dgvMyView);

Class:

using System.IO;
using System.Text.RegularExpressions;
using System.Windows.Forms;    
public class CSVReader
{
    private const string ESCAPE_SPLIT_REGEX = "({1}[^{1}]*{1})*(?<Separator>{0})({1}[^{1}]*{1})*";
    private string[] FieldNames;
    private List<string[]> Records;
    private int ReadIndex;

    public CSVReader(string File)
    {
        Records = new List<string[]>();
        string[] Record = null;
        StreamReader Reader = new StreamReader(File);
        int Index = 0;
        bool BlankRecord = true;

        FieldNames = GetEscapedSVs(Reader.ReadLine());
        while (!Reader.EndOfStream)
        {
            Record = GetEscapedSVs(Reader.ReadLine());
            BlankRecord = true;
            for (Index = 0; Index <= Record.Length - 1; Index++)
            {
                if (!string.IsNullOrEmpty(Record[Index])) BlankRecord = false;
            }
            if (!BlankRecord) Records.Add(Record);
        }
        ReadIndex = -1;
        Reader.Close();
    }

    private string[] GetEscapedSVs(string Data)
    {
        return GetEscapedSVs(Data, ",", "\"");
    }
    private string[] GetEscapedSVs(string Data, string Separator, string Escape)
    {
        string[] Result = null;
        int Index = 0;
        int PriorMatchIndex = 0;
        MatchCollection Matches = Regex.Matches(Data, string.Format(ESCAPE_SPLIT_REGEX, Separator, Escape));

        Result = new string[Matches.Count];


        for (Index = 0; Index <= Result.Length - 2; Index++)
        {
            Result[Index] = Data.Substring(PriorMatchIndex, Matches[Index].Groups["Separator"].Index - PriorMatchIndex);
            PriorMatchIndex = Matches[Index].Groups["Separator"].Index + Separator.Length;
        }
        Result[Result.Length - 1] = Data.Substring(PriorMatchIndex);

        for (Index = 0; Index <= Result.Length - 1; Index++)
        {
            if (Regex.IsMatch(Result[Index], string.Format("^{0}[^{0}].*[^{0}]{0}$", Escape))) Result[Index] = Result[Index].Substring(1, Result[Index].Length - 2);
            Result[Index] = Result[Index].Replace(Escape + Escape, Escape);
            if (Result[Index] == null) Result[Index] = "";
        }

        return Result;
    }

    public int FieldCount
    {
        get { return FieldNames.Length; }
    }

    public string GetString(int Index)
    {
        return Records[ReadIndex][Index];
    }

    public string GetName(int Index)
    {
        return FieldNames[Index];
    }

    public bool Read()
    {
        ReadIndex = ReadIndex + 1;
        return ReadIndex < Records.Count;
    }


    public void DisplayResults(DataGridView DataView)
    {
        DataGridViewColumn col = default(DataGridViewColumn);
        DataGridViewRow row = default(DataGridViewRow);
        DataGridViewCell cell = default(DataGridViewCell);
        DataGridViewColumnHeaderCell header = default(DataGridViewColumnHeaderCell);
        int Index = 0;
        ReadIndex = -1;

        DataView.Rows.Clear();
        DataView.Columns.Clear();

        for (Index = 0; Index <= FieldCount - 1; Index++)
        {
            col = new DataGridViewColumn();
            col.CellTemplate = new DataGridViewTextBoxCell();
            header = new DataGridViewColumnHeaderCell();
            header.Value = GetName(Index);
            col.HeaderCell = header;
            DataView.Columns.Add(col);
        }

        while (Read())
        {
            row = new DataGridViewRow();
            for (Index = 0; Index <= FieldCount - 1; Index++)
            {
                cell = new DataGridViewTextBoxCell();
                cell.Value = GetString(Index).ToString();
                row.Cells.Add(cell);
            }
            DataView.Rows.Add(row);
        }
    }
}

Upvotes: 1

Vinko Vrsalovic
Vinko Vrsalovic

Reputation: 340496

Display where? About splitting, the best way is to use a good library to that effect.

This library is pretty good, I can recommend it heartily.

The problems using naïve methods is that the usually fail, there are tons of considerations without even thinking about performance:

  • What if the text contains commas
  • Support for the many existing formats (separated by semicolon, or text surrounded by quotes, or single quotes, etc.)
  • and many others

Upvotes: 5

Nippysaurus
Nippysaurus

Reputation: 20378

read the file one line at a time, then ...

foreach (String line in line.Split(new char[] { ',' }))
    Console.WriteLine(line);

Upvotes: 1

Related Questions