Reputation: 1806
I have been given the task of deserializing some data. The data has all been munged into a string which is in the following format:
InternalNameA8ValueDisplay NameA¬InternalNameB8ValueDisplay NameB¬
etc etc.
(i.e., it has an internal name, '8', the value, the display name, followed by '¬' **). For example, you'd have FirstName8JoeFirst Name¬
I have no control over how this data is serialized; it's legacy stuff.
I've thought of doing a bunch of splits on the string, or breaking it up into a char array and splitting down the text that way. But this just seems horrible: there is too much that could go wrong (e.g., the value of a phone number could begin with '8').
What I want to know is what people's approaches to this would be. Is there anything more clever I can do to break the data down?
** Note: '¬' isn't actually the character; it looks more like an arrow pointing left. But I'm away from my machine at the moment. Doh!
Thanks.
Upvotes: 0
Views: 91
Reputation: 7449
Instead of using splits, I would recommend using a simple state machine. Walk over each character until you hit a delimiter; then you know you're on the next field. Because '8' is only treated as a delimiter while you're still reading the internal name, that takes care of issues like an "8" in a phone number.
NOTE - untested code ahead.
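// requires: using System.Collections.Generic;
var records = new List<string[]>();
// fields[0] = internal name; fields[1] = value and display name together,
// since the sample shows no delimiter between them.
var fields = new[] { "", "" };
var currentField = 0;
var line = "InternalNameA8ValueDisplay NameA¬InternalNameB8ValueDisplay NameB¬";
foreach (var c in line)
{
    if (c == '8' && currentField == 0)
    {
        // Only the first '8' in a record is a delimiter; later ones are data.
        currentField = 1;
        continue;
    }
    if (c == '¬')
    {
        // End of record: store it and reset the state for the next record.
        records.Add(fields);
        fields = new[] { "", "" };
        currentField = 0;
        continue;
    }
    fields[currentField] += c;
}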
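For comparison, the split-based idea from the question can be made safer by limiting the split count, so that only the first '8' in each record is treated as a delimiter. This is a minimal sketch, not a tested implementation: `LegacyParser` and `Parse` are hypothetical names, '¬' stands in for the real terminator character, and it assumes the internal name itself never contains '8'. As with the loop above, the value and display name stay together because the sample shows no delimiter between them.

```csharp
using System;
using System.Collections.Generic;

static class LegacyParser
{
    // Hypothetical helper: returns (internal name, everything after the first '8') per record.
    public static List<(string Name, string Rest)> Parse(string data, char terminator = '¬')
    {
        var result = new List<(string Name, string Rest)>();

        // Split into records on the terminator; the trailing terminator yields an empty entry, which is dropped.
        foreach (var record in data.Split(new[] { terminator }, StringSplitOptions.RemoveEmptyEntries))
        {
            // A count of 2 splits at the first '8' only, so an '8' inside the value is left alone.
            var parts = record.Split(new[] { '8' }, 2);
            result.Add((parts[0], parts.Length > 1 ? parts[1] : ""));
        }

        return result;
    }
}
```

With this sketch, `LegacyParser.Parse("FirstName8555-8888First Name¬")` gives a single entry whose `Name` is "FirstName" and whose `Rest` is "555-8888First Name".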
Dealing with wonky formats - always a good time!
Good luck, Erick
Upvotes: 1