Reputation: 420
Using C#, I need to parse a CSV string that doesn't come from a file. I've found a great deal of material on parsing CSV files, but virtually nothing on strings. It seems as though this should be simple, yet thus far I can come up only with inefficient methods, such as this:
using System.IO;
using Microsoft.VisualBasic.FileIO;

var csvParser = new TextFieldParser(new StringReader(strCsvLine));
csvParser.SetDelimiters(new string[] { "," });
csvParser.HasFieldsEnclosedInQuotes = true;
string[] fields = csvParser.ReadFields(); // read the columns of the single line
Are there good ways of making this more efficient and less ugly? I will be processing huge volumes of strings, so I wouldn't want to pay the cost of all the above. Thanks.
Upvotes: 0
Views: 225
Reputation: 51
Here is a lightly tested parser that handles quoted fields and escaped quotes:
using System.Collections.Generic;
using System.Text;

public List<string> Parse(string line)
{
    var columns = new List<string>();
    var sb = new StringBuilder();
    bool isQuoted = false;

    for (int i = 0; i < line.Length; i++)
    {
        char c = line[i];

        // If the current character is a double quote
        if (c == '"')
        {
            // A quote at the very start of a column opens a quoted section
            if (!isQuoted && sb.Length == 0)
            {
                isQuoted = true;
            }
            else if (isQuoted && i + 1 < line.Length && line[i + 1] == '"') // Escaped double quote ("")
            {
                sb.Append('"');
                i++; // Skip the second quote of the pair
            }
            else if (isQuoted) // A lone quote inside a quoted section closes it
            {
                isQuoted = false;
            }
            else // A quote in the middle of an unquoted column is kept literally
            {
                sb.Append('"');
            }
            continue;
        }

        // A comma outside a quoted section ends the column: store it and clear the StringBuilder
        if (!isQuoted && c == ',')
        {
            columns.Add(sb.ToString());
            sb.Clear();
            continue;
        }

        // Append the character to the current column
        sb.Append(c);
    }

    // Add the last column
    columns.Add(sb.ToString());
    return columns;
}
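Since the parser is only lightly tested, a few quick checks may be worth running before pointing it at large volumes of input. The following is a minimal sketch, not part of the parser itself: ParseChecks, Matches, Run and Require are hypothetical names, and the parser is passed in as a delegate so the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: exercises any line parser that has the same
// shape as Parse(string) above.
public static class ParseChecks
{
    // True when parse(input) yields exactly the expected columns, in order.
    public static bool Matches(Func<string, List<string>> parse, string input, params string[] expected)
    {
        return parse(input).SequenceEqual(expected);
    }

    // Runs a handful of representative lines; throws on the first mismatch.
    public static void Run(Func<string, List<string>> parse)
    {
        Require(parse, "a,b,c", "a", "b", "c");                      // plain fields
        Require(parse, "a,,b", "a", "", "b");                        // empty field
        Require(parse, "\"x,y\",z", "x,y", "z");                     // comma inside quotes
        Require(parse, "\"say \"\"hi\"\"\",ok", "say \"hi\"", "ok"); // escaped quotes
    }

    private static void Require(Func<string, List<string>> parse, string input, params string[] expected)
    {
        if (!Matches(parse, input, expected))
            throw new InvalidOperationException("Unexpected columns for: " + input);
    }
}
```

Usage would be something like ParseChecks.Run(parser.Parse), where parser is an instance of whatever class holds the Parse method.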
Upvotes: 3