Prix

Reputation: 19528

Parse string/stringbuilder into class, how should I go about it?

All the data I need to parse is currently stored in a StringBuilder, and I would like to parse it into a list of my class:

StringBuilder data = new StringBuilder(length);

So I have my class assigned to a list:

public class Messages
{
    public DateTime Sent { get; set; }
    public string User { get; set; }
    public MessageType TypeUsed { get; set; }
    public string Message { get; set; }
}

public enum MessageType
{
    System,
    Info,
    Warn
}

public List<Messages> myList = new List<Messages>();

Here are some sample messages that I need to parse:

[13:49:13] [System Message] <Username>  has openned blocked website 
[13:49:14] <Username> accessed file X
[13:52:46] [System Message] <Username>  has entered this room 
[13:52:49] [System Message] <Username>  has left this room 

My question is what the best way to parse this would be.

The time is present in every message. The username is always enclosed in <>. When there is no [System Message] or [Warn Message] tag, the message is of type Info. The message text is the rest of the line, for example:

has left this room
accessed file X
has openned blocked website

This is where I am still undecided about what to use.

I could use a regex to extract each field, something like this:

Regex getData = new Regex(@"^\[(\d{1,2}:\d{1,2}:\d{1,2})\] \[([A-Za-z]+)\] ");

But then I would basically need to make several checks for each message, so I was not comfortable with it.

I also thought about using Split, for example:

string line = item.Replace("[", "").Replace("]", "");
string[] fields = line.Split(' ');

and then check the resulting fields. That would make the MessageType easy to detect, but I don't think it would be very reliable.

I would appreciate some advice and ideas on how to go about this.

Maybe I am just overcomplicating the logic :/

Upvotes: 1

Views: 1137

Answers (1)

Jon

Reputation: 437474

A regex is probably most convenient here. Try this one:

^\[(\d{2}:\d{2}:\d{2})\]\s*(\[(System|Warn)[\w\s]*\])?\s*<([^>]*)>\s*(.*)$

Translation:

  • Starting at the beginning of the line, match [##:##:##] into capture group 1
  • Then optionally match the System/Warn specifier into capture groups 2 and 3 (group 2 holds the whole bracketed text, group 3 only the System/Warn keyword)
  • Then capture the username inside angle brackets into capture group 4
  • And finally the message text in group 5

By testing the contents of group 2 or 3 for each line you know what type of message it is. All the other fields are ready to use straight from the capture groups.
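As a minimal sketch of that check, the helper below tests whether the optional group 3 took part in the match and maps it to the MessageType enum from the question. The class name MessageTypeDetector and the method name Detect are hypothetical, chosen only for illustration, and returning null for a non-matching line is an assumption about how bad input should be handled.

```csharp
using System;
using System.Text.RegularExpressions;

// Enum from the question.
public enum MessageType { System, Info, Warn }

// Hypothetical helper, not part of the original answer.
public static class MessageTypeDetector
{
    private static readonly Regex LineRegex = new Regex(
        @"^\[(\d{2}:\d{2}:\d{2})\]\s*(\[(System|Warn)[\w\s]*\])?\s*<([^>]*)>\s*(.*)$");

    // Returns the message type for a line, or null when the line does not match.
    public static MessageType? Detect(string line)
    {
        Match match = LineRegex.Match(line);
        if (!match.Success)
            return null;

        // An optional group that did not participate in the match
        // reports Success == false and an empty Value.
        Group keyword = match.Groups[3];
        if (!keyword.Success)
            return MessageType.Info;

        return keyword.Value == "System" ? MessageType.System : MessageType.Warn;
    }
}
```

Checking Groups[3].Success rather than comparing Value to an empty string makes the "no specifier present" case explicit.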

Update:

Here's sample code as per the above:

var regex = new Regex(@"^\[(\d{2}:\d{2}:\d{2})\]\s*(\[(System|Warn)[\w\s]*\])?\s*<([^>]*)>\s*(.*)$");
var input = new[]
    {
        "[13:49:13] [System Message] <Username>  has openned blocked website", 
        "[13:49:14] <Username> accessed file X",
        "[13:52:46] [System Message] <Username>  has entered this room",
        "[13:52:49] [System Message] <Username>  has left this room"
    };

foreach (var line in input) {
    var match = regex.Match(line);
    if (!match.Success) {
        // Fail loudly on lines that do not fit the expected format.
        throw new ArgumentException("Unrecognized message format: " + line);
    }

    Console.WriteLine("NEW MESSAGE:");
    Console.WriteLine("     Time: " + match.Groups[1]);
    Console.WriteLine("     Type: " + match.Groups[2]);
    Console.WriteLine("     User: " + match.Groups[4]);
    Console.WriteLine("     Text: " + match.Groups[5]);

}
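To go from the capture groups to the list the question asks for, something along these lines might work. It is a sketch under several assumptions: the StringBuilder holds one message per line, lines that do not match (or carry an invalid time) are skipped rather than treated as errors, and since the log has no date, Sent ends up with today's date plus the parsed time. The class name MessageParser and the method name Parse are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;

// Types from the question.
public enum MessageType { System, Info, Warn }

public class Messages
{
    public DateTime Sent { get; set; }
    public string User { get; set; }
    public MessageType TypeUsed { get; set; }
    public string Message { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class MessageParser
{
    private static readonly Regex LineRegex = new Regex(
        @"^\[(\d{2}:\d{2}:\d{2})\]\s*(\[(System|Warn)[\w\s]*\])?\s*<([^>]*)>\s*(.*)$");

    public static List<Messages> Parse(StringBuilder data)
    {
        var result = new List<Messages>();

        // Assumption: one message per line.
        string[] lines = data.ToString().Split(
            new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string line in lines)
        {
            // Trim so trailing spaces do not end up in the message text.
            Match match = LineRegex.Match(line.Trim());
            if (!match.Success)
                continue; // assumption: silently skip malformed lines

            // The log has no date, so the date part defaults to today.
            DateTime sent;
            if (!DateTime.TryParseExact(match.Groups[1].Value, "HH:mm:ss",
                    CultureInfo.InvariantCulture, DateTimeStyles.None, out sent))
                continue; // e.g. "25:61:00" passes the regex but is not a time

            // No [System ...] / [Warn ...] specifier means an Info message.
            MessageType type = MessageType.Info;
            if (match.Groups[3].Success)
                type = (MessageType)Enum.Parse(typeof(MessageType), match.Groups[3].Value);

            result.Add(new Messages
            {
                Sent = sent,
                User = match.Groups[4].Value,
                TypeUsed = type,
                Message = match.Groups[5].Value
            });
        }

        return result;
    }
}
```

With the StringBuilder from the question, usage would be myList = MessageParser.Parse(data);. If only the time of day matters, a TimeSpan property may be a better fit than DateTime for Sent.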

Upvotes: 2
