Reputation: 19528
All the data I have to parse is currently stored in a StringBuilder, and I would like to parse it into a list of my class:
StringBuilder data = new StringBuilder(length);
This is my class, the enum it uses, and the list that holds the parsed items:
public class Messages
{
    public DateTime Sent { get; set; }
    public string User { get; set; }
    public MessageType TypeUsed { get; set; }
    public string Message { get; set; }
}

public enum MessageType
{
    System,
    Info,
    Warn
}

public List<Messages> myList = new List<Messages>();
Here are some sample messages that I need to parse:
[13:49:13] [System Message] <Username> has openned blocked website
[13:49:14] <Username> accessed file X
[13:52:46] [System Message] <Username> has entered this room
[13:52:49] [System Message] <Username> has left this room
My question is what the best way to parse this would be.
The time is present in all messages.
The username is always enclosed in <>.
When there is no [System Message] or [Warn Message] tag, it is an Info type message.
The message is the rest of the line, for example:
has left this room
accessed file X
has openned blocked website
This is where I am still unsure what to use.
I could use a regex to extract each field, something like this:
Regex getData = new Regex(@"^\[(\d{1,2}:\d{1,2}:\d{1,2})\] \[([A-Za-z]+)\] ");
But then I would basically need to make several checks for each message, so I was not very comfortable with it.
I also thought about using Split, for example:
string line = item.Replace("[", "").Replace("]", "");
string[] fields = line.Split(' ');
Then I would check the resulting fields. The MessageType would be easy to detect that way, but I don't think it would be very reliable.
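To illustrate the reliability concern, here is a minimal sketch of what the Split approach produces for two of the sample lines. The names SplitDemo and SplitFields are made up for this illustration; the body of SplitFields is the two lines shown above.

```csharp
using System;

public static class SplitDemo
{
    // Strips the square brackets and splits on spaces, as in the snippet above.
    public static string[] SplitFields(string item)
    {
        string line = item.Replace("[", "").Replace("]", "");
        return line.Split(' ');
    }

    public static void Main()
    {
        string[] system = SplitFields("[13:49:13] [System Message] <Username> has openned blocked website");
        string[] info = SplitFields("[13:49:14] <Username> accessed file X");

        // The user name ends up at index 3 for the tagged line and at index 1
        // for the untagged one, and the message text is scattered over all the
        // remaining fields, so every field needs position-dependent checks.
        Console.WriteLine(string.Join(" | ", system));
        Console.WriteLine(string.Join(" | ", info));
    }
}
```

The message text would also have to be stitched back together from the trailing fields, which is part of why this feels fragile.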
I would appreciate some advice and ideas on how to approach this.
Maybe I am just overcomplicating the logic :/
Upvotes: 1
Views: 1137
Reputation: 437474
A regex is probably most convenient here. Try this one:
^\[(\d{2}:\d{2}:\d{2})\]\s*(\[(System|Warn)[\w\s]*\])?\s*<([^>]*)>\s*(.*)$
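Translation:
^\[(\d{2}:\d{2}:\d{2})\] matches the timestamp in square brackets at the start of the line and captures the time in group 1.
\s*(\[(System|Warn)[\w\s]*\])? optionally matches a bracketed tag that starts with "System" or "Warn"; group 2 captures the whole tag and group 3 captures just that keyword.
\s*<([^>]*)> matches the user name in angle brackets and captures it, without the brackets, in group 4.
\s*(.*)$ captures the rest of the line, which is the message text, in group 5.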
By testing the contents of group 2 or 3 for each line you know what type of message it is. All the other fields are ready to use straight from the capture groups.
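As a rough sketch of that group test, the helper below maps a match to the MessageType enum from the question. The names MessageTypeDemo and GetMessageType are hypothetical, and treating every line without a tag as Info is taken from the question's description rather than from anything the regex enforces.

```csharp
using System;
using System.Text.RegularExpressions;

public enum MessageType { System, Info, Warn }

public static class MessageTypeDemo
{
    // Same pattern as above; group 3 holds "System" or "Warn" when a tag is present.
    public static readonly Regex LineRegex = new Regex(
        @"^\[(\d{2}:\d{2}:\d{2})\]\s*(\[(System|Warn)[\w\s]*\])?\s*<([^>]*)>\s*(.*)$");

    // Assumption: a line without a [System ...] or [Warn ...] tag is an Info message.
    public static MessageType GetMessageType(Match match)
    {
        Group tag = match.Groups[3];
        if (!tag.Success)
        {
            return MessageType.Info;
        }
        return tag.Value == "Warn" ? MessageType.Warn : MessageType.System;
    }

    public static void Main()
    {
        Match tagged = LineRegex.Match("[13:52:46] [System Message] <Username> has entered this room");
        Match untagged = LineRegex.Match("[13:49:14] <Username> accessed file X");
        Console.WriteLine(GetMessageType(tagged));
        Console.WriteLine(GetMessageType(untagged));
    }
}
```

Checking Group.Success rather than comparing against an empty string makes it explicit that the optional tag simply did not take part in the match.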
Update:
Here's sample code as per the above:
var regex = new Regex(@"^\[(\d{2}:\d{2}:\d{2})\]\s*(\[(System|Warn)[\w\s]*\])?\s*<([^>]*)>\s*(.*)$");
var input = new[]
{
    "[13:49:13] [System Message] <Username> has openned blocked website",
    "[13:49:14] <Username> accessed file X",
    "[13:52:46] [System Message] <Username> has entered this room",
    "[13:52:49] [System Message] <Username> has left this room"
};
foreach (var line in input) {
    var match = regex.Match(line);
    if (!match.Success) {
        // The line does not follow the expected format.
        throw new ArgumentException("Unrecognized line: " + line);
    }
    Console.WriteLine("NEW MESSAGE:");
    Console.WriteLine(" Time: " + match.Groups[1]);
    // Group 2 is empty when the line has no bracketed tag (an Info message).
    Console.WriteLine(" Type: " + match.Groups[2]);
    Console.WriteLine(" User: " + match.Groups[4]);
    Console.WriteLine(" Text: " + match.Groups[5]);
}
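Going one step further, here is a hedged sketch of how the matches could be turned into the List<Messages> from the question. It is a variation and not the exact code above: it uses named groups instead of numbered ones, and the MessageParser class and Parse method are made-up names. It assumes one message per line in the StringBuilder, silently skips lines that do not match (throwing, as in the sample above, is an equally valid choice), and since the log carries no date, the Sent value gets today's date combined with the parsed time.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;

public enum MessageType { System, Info, Warn }

public class Messages
{
    public DateTime Sent { get; set; }
    public string User { get; set; }
    public MessageType TypeUsed { get; set; }
    public string Message { get; set; }
}

public static class MessageParser
{
    // Same pattern as above, rewritten with named groups for readability.
    private static readonly Regex LineRegex = new Regex(
        @"^\[(?<time>\d{2}:\d{2}:\d{2})\]\s*(?:\[(?<type>System|Warn)[\w\s]*\])?\s*<(?<user>[^>]*)>\s*(?<text>.*)$");

    public static List<Messages> Parse(StringBuilder data)
    {
        var result = new List<Messages>();

        // Assumption: each message occupies exactly one line.
        string[] lines = data.ToString().Split(
            new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string line in lines)
        {
            Match match = LineRegex.Match(line);
            if (!match.Success)
            {
                continue; // Design choice for this sketch: skip unrecognized lines.
            }

            // The log has no date, so the date part defaults to the current date.
            DateTime sent;
            if (!DateTime.TryParseExact(match.Groups["time"].Value, "HH:mm:ss",
                    CultureInfo.InvariantCulture, DateTimeStyles.None, out sent))
            {
                continue; // Digits matched but do not form a valid time, e.g. 25:00:00.
            }

            // An absent tag yields an empty string, which fails to parse, so the
            // type falls back to Info as described in the question.
            MessageType type;
            if (!Enum.TryParse(match.Groups["type"].Value, out type))
            {
                type = MessageType.Info;
            }

            result.Add(new Messages
            {
                Sent = sent,
                User = match.Groups["user"].Value,
                TypeUsed = type,
                Message = match.Groups["text"].Value
            });
        }

        return result;
    }
}
```

With the data variable from the question, usage would look like myList = MessageParser.Parse(data);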
Upvotes: 2