Reputation: 21
I'm about to build a solution for a file I receive every night: a comma-separated list of around 14,000 rows. I need to go through the list and pick out some of the values. The document has around 50 semicolon-separated values for every "case". It is structured like this:
"";"2010-10-17";"";"";"";"Period-Last24h";"Problem is that the customer cant find...."; and so on, with 43 more semicolon-separated values. Every "case" ends with the value "Total 515";
What I need to do is go through all these "cases" and extract some of their values. The "cases" are always built up in the same order, and I know it is always the 3rd, 15th and 45th semicolon-separated values that I need.
How can I do this in the easiest way?
Upvotes: 2
Views: 777
Reputation:
A simple but slow approach would be to read single characters from the input (with the StringReader class, for example). Write a ReadItem method that reads a quote, continues reading until the next quote, and then looks at the next character. If it is a newline or a semicolon, one item has been read. If it is another quote, add a single quote to the item being read and keep reading. Otherwise, throw an exception. Then use this method to split the input data into a series of items, with each line stored e.g. in a string[number of items in a row] and the lines stored in a List<>. You can then use this class to read the CSV data from another class that decodes the data into objects you can get your values out of.
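A minimal sketch of that character-by-character reader might look like the following. The class name QuotedReader and the ReadRows wrapper are hypothetical, and the sketch assumes every item is quoted and that a row may end with a trailing semicolon before the newline, as in the question's sample.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper; the names are illustrative, not from an existing library.
public static class QuotedReader
{
    // Reads every row of quoted, semicolon-separated items from the reader.
    public static List<string[]> ReadRows(TextReader reader)
    {
        var rows = new List<string[]>();
        var row = new List<string>();
        while (reader.Peek() != -1)
        {
            int p = reader.Peek();
            if (p == '\r' || p == '\n')
            {
                // A line break after a trailing ';' closes the current row.
                reader.Read();
                if (row.Count > 0) { rows.Add(row.ToArray()); row.Clear(); }
                continue;
            }
            row.Add(ReadItem(reader, out bool endOfRow));
            if (endOfRow) { rows.Add(row.ToArray()); row.Clear(); }
        }
        if (row.Count > 0) rows.Add(row.ToArray());
        return rows;
    }

    // Reads one quoted item plus the separator that follows it.
    private static string ReadItem(TextReader reader, out bool endOfRow)
    {
        if (reader.Read() != '"')
            throw new FormatException("Expected an opening quote.");
        var item = new StringBuilder();
        while (true)
        {
            int c = reader.Read();
            if (c == -1) throw new FormatException("Unterminated item.");
            if (c != '"') { item.Append((char)c); continue; }

            // A quote was read: inspect the character that follows it.
            int next = reader.Read();
            if (next == '"') { item.Append('"'); continue; }  // doubled quote
            if (next == ';') { endOfRow = false; return item.ToString(); }
            if (next == '\r')
            {
                if (reader.Peek() == '\n') reader.Read();
                next = '\n';
            }
            if (next == '\n' || next == -1) { endOfRow = true; return item.ToString(); }
            throw new FormatException("Unexpected character after closing quote.");
        }
    }
}
```

Calling QuotedReader.ReadRows(new StringReader(text)) returns one string[] per row, so the 3rd, 15th and 45th values of a row are row[2], row[14] and row[44].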
Upvotes: 0
Reputation: 137118
You could use String.Split twice.
The first time, use "Total 515"; as the split string, with the overload that takes a string[] of separators. This will give you an array of cases.
The second time, use ";" as the split character, with the overload that takes char separators, on each of the cases. This will give you a data array for each case. As the data is consistent, you can extract the 3rd, 15th and 45th elements of this array.
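The two splits could be sketched roughly as below. The class and method names are hypothetical, and the sketch assumes the terminator is written exactly as "Total 515"; (quotes included) and that no value contains a semicolon.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper illustrating the two-pass split.
public static class SplitTwice
{
    // Returns the 3rd, 15th and 45th values of every case in the document.
    public static List<string[]> Extract(string document)
    {
        var result = new List<string[]>();

        // First split: one string per case, cut at the assumed terminator.
        string[] cases = document.Split(
            new[] { "\"Total 515\";" }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string c in cases)
        {
            // Second split: one string per semicolon-separated value.
            string[] values = c.Trim().Split(';');
            if (values.Length < 45) continue; // skip fragments such as a trailing newline

            // Trim('"') strips the surrounding quotes from each value.
            result.Add(new[]
            {
                values[2].Trim('"'),
                values[14].Trim('"'),
                values[44].Trim('"')
            });
        }
        return result;
    }
}
```

Each entry of the returned list holds the three extracted values for one case, in the order 3rd, 15th, 45th.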
Upvotes: 1
Reputation: 273179
Assuming the "rows" are lines and that you read line by line, your main tool should be string.Split:
foreach (string line in ... )
{
    // Split the line on every semicolon; array indexes are zero-based.
    string[] parts = line.Split(';');
    string part3 = parts[2];    // 3rd value
    string part15 = parts[14];  // 15th value
    // etc
}
Note that this is a simple approach that will fail if the content of any column can contain ';'.
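To make that caveat concrete, here is a small sketch (the SplitCaveat class is hypothetical) that counts how many parts a plain Split(';') produces. A semicolon inside a quoted value is treated as a separator, so every later index shifts by one.

```csharp
using System;

// Hypothetical demo of the limitation described above.
public static class SplitCaveat
{
    // Number of parts a naive split on ';' produces for a line.
    public static int CountParts(string line) => line.Split(';').Length;
}
```

For the line "a";"b" this gives 2 parts, as expected, but for "a;b";"c" it gives 3 parts even though the line holds only two values.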
Upvotes: 2
Reputation: 108790
I'd search for an existing CSV library. The escaping rules are probably not that easily mapped to a regex.
If I were writing a library myself, I'd first parse each line into a list or array of strings. Then, in a second step (probably outside the CSV library itself), I'd convert the string list to a strongly typed object.
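As one possible illustration of both points, the sketch below uses TextFieldParser from the Microsoft.VisualBasic.FileIO namespace, which ships with .NET, as the existing parser, and then maps each row to a typed object in a separate step. It is not necessarily the library meant here. CaseRecord, CaseImporter and the property names are hypothetical placeholders, since the question does not say what the three columns mean.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

// Hypothetical target type; the property names are placeholders.
public sealed class CaseRecord
{
    public string Third { get; set; } = "";
    public string Fifteenth { get; set; } = "";
    public string FortyFifth { get; set; } = "";
}

public static class CaseImporter
{
    // Step 1: let the parser turn each line into an array of strings.
    public static List<string[]> ReadRows(TextReader input)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(input))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(";");
            parser.HasFieldsEnclosedInQuotes = true; // ';' inside quotes stays in the value
            while (!parser.EndOfData)
                rows.Add(parser.ReadFields());
        }
        return rows;
    }

    // Step 2: convert one row of strings into a strongly typed object.
    public static CaseRecord ToRecord(string[] fields) => new CaseRecord
    {
        Third = fields[2],
        Fifteenth = fields[14],
        FortyFifth = fields[44]
    };
}
```

Keeping the two steps separate means the parsing can be swapped for another library later without touching the code that builds CaseRecord objects.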
Upvotes: 0
Reputation: 308743
I think you should decompose this problem into smaller problems. Here are the steps I'd take:
Don't worry about the "easiest" way. You need one way that works. Whatever you do, get something working and worry about optimizing it to make it easiest, fastest, smallest, etc. later on.
Upvotes: 2