mdc

Reputation: 1201

How to parse this string in the best way

I have several strings that look like this one:

\r\n\t\StaticWord1:\r\n\t\t2014-05-20 11:03\r\n\t\StaticWord2\r\n\t\t\r\n\t\t\r\n\t\t\t\r\n\t\t\t\r\n\t\t\t\t\t\t\t\t\tWordC WordD\r\n\t\t\t\t\t\t\t\t\r\n\t\t\r\n\t\t\r\n\t\t\r\n\t

I would like to extract the date (2014-05-20 11:03 in my example, but it will vary), WordC and WordD. (Both WordC and WordD can be any sequence of letters.)

How would I parse this as efficiently as possible? I was thinking about using the String.Replace method, but I suspect a regex would be better. (C#)

Upvotes: 0

Views: 121

Answers (3)

garyh

Reputation: 2852

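// Verbatim string: "\r", "\n" and "\t" below are literal backslash sequences, not control characters.
string input = @"\r\n\t\StaticWord1:\r\n\t\t2014-05-20 11:03\r\n\t\StaticWord2\r\n\t\t\r\n\t\t\r\n\t\t\t\r\n\t\t\t\r\n\t\t\t\t\t\t\t\t\tWordC WordD\r\n\t\t\t\t\t\t\t\t\r\n\t\t\r\n\t\t\r\n\t\t\r\n\t";

// [\\r\\n\\t]* skips any run of the characters '\', 'r', 'n' and 't' around StaticWord2.
// Caveat: a word that starts with r, n or t would lose those leading letters.
string pattern = @"(\d{4}-\d{2}-\d{2}\s\d{2}:\d{2})(?:[\\r\\n\\t]*StaticWord2[\\r\\n\\t]*)(\w+)\s(\w+)";

Match match = Regex.Match(input, pattern);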

Then to get the values:

match.Groups[1].Value;  // date-time
match.Groups[2].Value;  // WordC
match.Groups[3].Value;  // WordD
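
If the real strings contain actual carriage returns, line feeds and tabs rather than literal backslash sequences, a variation along these lines may be easier to maintain. This is only a sketch under that assumption: `RecordParser` and `TryParse` are hypothetical names, the stray `\` before the static words in the sample is assumed to be a copy artifact, and `\p{L}+` is used because the question says WordC and WordD are sequences of letters.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class RecordParser
{
    // Date, any whitespace, the fixed marker StaticWord2, any whitespace,
    // then two letter-only words separated by a single space.
    private static readonly Regex Pattern = new Regex(
        @"(?<date>\d{4}-\d{2}-\d{2} \d{2}:\d{2})\s*StaticWord2\s*(?<c>\p{L}+) (?<d>\p{L}+)",
        RegexOptions.CultureInvariant);

    public static bool TryParse(string input, out DateTime date, out string wordC, out string wordD)
    {
        date = default(DateTime);
        wordC = string.Empty;
        wordD = string.Empty;

        Match m = Pattern.Match(input);
        if (!m.Success)
        {
            return false;
        }

        // Reject text that looks like a date but is not a valid one (for example month 13).
        if (!DateTime.TryParseExact(m.Groups["date"].Value, "yyyy-MM-dd HH:mm",
                CultureInfo.InvariantCulture, DateTimeStyles.None, out date))
        {
            return false;
        }

        wordC = m.Groups["c"].Value;
        wordD = m.Groups["d"].Value;
        return true;
    }
}
```

Named groups keep the call site readable, and building the Regex once in a static field avoids re-parsing the pattern when many strings are processed.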

Upvotes: 0

Perfect28

Reputation: 11317

Use this pattern to capture the date:

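// Requires: using System.Globalization; using System.Text.RegularExpressions;
// The pattern has no ^ or $ anchors, so RegexOptions.Multiline is not needed.
Match match = Regex.Match(input, @"(\d{4}-\d{2}-\d{2} \d{2}:\d{2})");
if (match.Success)
{
    string key = match.Groups[1].Value;
    // An exact format plus the invariant culture keeps the result independent of the machine's locale.
    DateTime date = DateTime.ParseExact(key, "yyyy-MM-dd HH:mm", CultureInfo.InvariantCulture); // Your result is here
}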

Upvotes: 2

Alexein1

Reputation: 119

I don't know if it's the best way, but you can use Split as in this MSDN example: http://msdn.microsoft.com/en-us/library/ms228388.aspx

Following that example, you can create an array of delimiter characters (\t, \n, \r, ...), split your string on them, and loop over the result to get all your words:

using System;

class TestStringSplit
{
    static void Main()
    {
        char[] delimiterChars = { '\r', '\n', '\t' };

        // The sample in the question shows "\t\StaticWord1"; "\S" is not a valid C# escape,
        // so the stray backslash is dropped here on the assumption that it is a copy artifact.
        string text = "\r\n\tStaticWord1:\r\n\t\t2014-05-20 11:03\r\n\tStaticWord2\r\n\t\t\r\n\t\t\r\n\t\t\t\r\n\t\t\t\r\n\t\t\t\t\t\t\t\t\tWordC WordD\r\n\t\t\t\t\t\t\t\t\r\n\t\t\r\n\t\t\r\n\t\t\r\n\t";
        Console.WriteLine("Original text: '{0}'", text);

        // RemoveEmptyEntries drops the empty strings produced by consecutive delimiters.
        string[] words = text.Split(delimiterChars, StringSplitOptions.RemoveEmptyEntries);
        Console.WriteLine("{0} words in text:", words.Length);

        foreach (string s in words)
        {
            Console.WriteLine(s);
        }

        // Keep the console window open in debug mode.
        Console.WriteLine("Press any key to exit.");
        Console.ReadKey();
    }
}
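
To go from the split result to the three values, something like the following might work. It is a rough sketch that assumes the non-empty pieces always arrive in the order StaticWord1:, date, StaticWord2, "WordC WordD", and that the delimiters are real control characters; `SplitParser` and `Parse` are hypothetical names.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of any library.
public static class SplitParser
{
    public static Tuple<DateTime, string, string> Parse(string text)
    {
        // A space is deliberately not a delimiter here, so "2014-05-20 11:03" stays in one piece.
        char[] delimiters = { '\r', '\n', '\t' };
        string[] parts = text.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);

        // Assumed layout: [0] StaticWord1:, [1] date, [2] StaticWord2, [3] "WordC WordD".
        if (parts.Length < 4)
        {
            throw new FormatException("Expected at least four non-empty pieces.");
        }

        DateTime date = DateTime.ParseExact(parts[1], "yyyy-MM-dd HH:mm", CultureInfo.InvariantCulture);

        // Split the last piece on spaces to separate the two words.
        string[] words = parts[3].Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        if (words.Length != 2)
        {
            throw new FormatException("Expected exactly two words.");
        }

        return Tuple.Create(date, words[0], words[1]);
    }
}
```

This relies on position rather than on a pattern, so it is only appropriate if the layout of the strings really is fixed.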

Upvotes: 2
