Reputation: 297
I am splitting up a comma delimited CSV-file by using this code below. It works fine, but some of the records in the file contains a comma within the data, even though comma is reserved as the delimiter. For example a record contains a description text where the user has entered commas and my script enterprets that like the record ends. How can I somehow get over that?
Here's an example of a line in my CSV-file. Please remark the empty records at the end of the lines can be witout the "" enclosing characters:
"1","34353434","Adam","Traxx","343","This man is the boss, please handle with respect","",,"","",,","","0",,"
This is how I read and split:
Sub ReadOnlineExample()
Dim line_read As String
Dim tempfilename As String
Dim i As Integer
Dim rows_skipped As Integer
Dim line_split(0 To 125) As String
Dim MortalityRates(0 To 125) As Double
tempfilename = "C:\MortalityRateTable.csv"
Dim sr As New System.IO.StreamReader(tempfilename)
'Split the line into individual data
line_split = line_read.Split(",")
'Save the split data into an array
For i = 0 To UBound(line_split)
MortalityRates(i) = line_split(i)
Next i
End Sub
Can anyone help? Maybe it's just a simple solution I just can't see :-)
Upvotes: 1
Views: 4586
Reputation: 2703
Your example is probably not correct.
Last part after sentence say: "",,"","",,","","0",,"
There is two base conditon:
x,y
, and delimiter is comma, in CSV you get "x,y"
)They say "Hey!"
in CSV you get "They say ""Hey!"""
)Other strings will be save as you type it.
And you can not transfer your last part with this rules. Try change commas to semicolon and import into Excel. Excel displays your example as follows:
In my country is standard CSV delimiter semicolon.
I write this code for parsing CSV. You can try modify it for your purposes (delimiter is variable):
public List<string> ParseCsvRow(char delimiter, string input)
{
List<string> result = new List<string>();
string step = "";
bool escaped = false;
StringBuilder stringBuilder = new StringBuilder();
int position = -1;
do
{
if (++position >= input.Length)
{
result.Add(stringBuilder.ToString());
stringBuilder = null;
break;
}
step = input.Substring(position, 1);
switch (step)
{
case "\"":
if (stringBuilder.Length == 0 && !escaped)
{
escaped = true;
continue;
}
if (position + 1 < input.Length)
step = input.Substring(++position, 1);
else
step = "";
if (step == "\"")
{
stringBuilder.Append("\"");
continue;
}
if (step.Equals(delimiter.ToString()) && escaped)
{
result.Add(stringBuilder.ToString());
stringBuilder.Clear();
escaped = false;
continue;
}
break;
default:
if (step.Equals(delimiter.ToString()) && !escaped)
{
result.Add(stringBuilder.ToString());
stringBuilder.Clear();
continue;
}
stringBuilder.Append(step);
continue;
}
} while (true);
return result;
}
Upvotes: 0
Reputation: 499002
Use a proper CSV parser - File Helpers is a popular option.
There is also a parser from Microsoft - in the Microsoft.VisualBasic.FileIO
namespace, the TextFieldParser
.
These take care of the subtle aspects of CSV that most hand rolled solutions don't consider until hitting an issue.
Upvotes: 6