NepSyn14

Reputation: 163

Regular Expression throwing up errors C#

I have a line of text for which I am writing a regular expression. I checked it on regex101.com and the resulting regular expression is error free there. This is the line of text...

    <Msg Date="2015/04/29" Time="12:13:39:187" DateReceived="2015/04/29" TimeReceived="12:13:39:187"><Layer Name="MC"><SourceLayer Name="GUI" /><Message Name="OperatorLogin" Id="1" Status="Successful" /></Layer></Msg>

This is the regular expression...

    [<][a-zA-Z]\w+\s[a-zA-Z]\w+[=]"(?<date>(?<year>(?:\d{4}|\d{2})[\/\-](?<month>\d{1,2})[\/\-](?<day>\d{1,2})))"\s[a-zA-Z]\w+[=]"(?<time>(?<hour>\d{2}):(?<minutes>\d{2}):(?<seconds>\d{2}:(?<milli>\.?\d{0,3})))"\s[a-zA-Z]\w+[=]"(?<date2>(?<year2>(?:\d{4}|\d{2})[\/\-](?<month2>\d{1,2})[\/\-](?<day2>\d{1,2})))"\s[a-zA-Z]\w+[=]"(?<time2>(?<hour2>\d{2}):(?<minutes2>\d{2}):(?<seconds2>\d{2}:(?<milli2>\.?\d{0,3})))"[>](?<logEntry>.*)

However, when I bring it into my program it throws up errors such as 'Unexpected character', 'Invalid expression term' and 'Unrecognized escape sequence'. I thought using the @ symbol at the beginning would prevent it from reading slashes as escape characters etc.

This is how it looks within the program...

                string strRegXPattern = @"([<][a-zA-Z]\w+\s[a-zA-Z]\w+[=]["'](?<date>(?<year>(?:\d{4}|\d{2})[\/\-](?<month>\d{1,2})[\/\-](?<day>\d{1,2})))["']\s[a-zA-Z]\w+[=]["'](?<time>(?<hour>\d{2}):(?<minutes>\d{2}):(?<seconds>\d{2}:(?<milli>\.?\d{0,3})))["']\s[a-zA-Z]\w+[=]["'](?<date2>(?<year2>(?:\d{4}|\d{2})[\/\-](?<month2>\d{1,2})[\/\-](?<day2>\d{1,2})))["']\s[a-zA-Z]\w+[=]["'](?<time2>(?<hour2>\d{2}):(?<minutes2>\d{2}):(?<seconds2>\d{2}:(?<milli2>\.?\d{0,3})))["'][>](?<logEntry>.*))";

I don't understand this. I wonder if it has something to do with the quotation marks "" or the angle brackets <>. I have tried putting them into [], (), ["'] etc. but it makes no difference. Can anyone see where I am going wrong? Thank you.

Upvotes: 1

Views: 88

Answers (4)

Wiktor Stribiżew

Reputation: 626893

What about using XElement and parsing the data as XML? Your data looks like valid XML.

    // Needs: using System.Linq; (for Where/Select/FirstOrDefault)
    var xelement = System.Xml.Linq.XElement.Parse("<Msg Date=\"2015/04/29\" Time=\"12:13:39:187\" DateReceived=\"2015/04/29\" TimeReceived=\"12:13:39:187\"><Layer Name=\"MC\"><SourceLayer Name=\"GUI\" /><Message Name=\"OperatorLogin\" Id=\"1\" Status=\"Successful\" /></Layer></Msg>");
    var result = xelement.DescendantsAndSelf("Msg");
    // Attribute(name) returns null when the attribute is missing; Attributes(name) never returns null.
    var time = result.Where(p => p.HasAttributes && p.Attribute("Time") != null).Select(p => p.Attribute("Time").Value).FirstOrDefault();
    var date = result.Where(p => p.HasAttributes && p.Attribute("Date") != null).Select(p => p.Attribute("Date").Value).FirstOrDefault();
    var dateReceived = result.Where(p => p.HasAttributes && p.Attribute("DateReceived") != null).Select(p => p.Attribute("DateReceived").Value).FirstOrDefault();


And you can manipulate further using DateTime.Parse or DateTime.TryParse.
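
As a rough sketch of that last step: the sample separates the milliseconds with a colon (12:13:39:187), which DateTime.Parse may not accept, so this version uses DateTime.TryParseExact with an explicit format. The format string and the variable names are my assumptions based on the single sample line, not something confirmed for the whole log.

```csharp
using System;
using System.Globalization;

// Attribute values copied from the sample <Msg> element in the question.
string date = "2015/04/29";
string time = "12:13:39:187";

// Assumed format: milliseconds follow a ':' rather than a '.',
// so an exact format string is spelled out instead of relying on DateTime.Parse.
const string format = "yyyy/MM/dd HH:mm:ss:fff";

bool ok = DateTime.TryParseExact(
    date + " " + time,
    format,
    CultureInfo.InvariantCulture,   // fixes '/' and ':' as the separators
    DateTimeStyles.None,
    out DateTime timestamp);

Console.WriteLine(ok ? timestamp.ToString("o") : "could not parse timestamp");
```

TryParseExact returns false instead of throwing, which is probably what you want when some log lines are malformed.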


Upvotes: 1

AlexD

Reputation: 32576

According to the standard (emphasis mine):

In a verbatim string literal, the characters between the delimiters are interpreted verbatim, the only exception being a quote-escape-sequence.

So try replacing each " with "":

    string strRegXPattern = @"([<][a-zA-Z]\w+\s[a-zA-Z]\w+[=][""'](?<date>(?<year>(?:\d{4}|\d{2})[\/\-](?<month>\d{1,2})[\/\-](?<day>\d{1,2})))[""']\s[a-zA-Z]\w+[=][""'](?<time>(?<hour>\d{2}):(?<minutes>\d{2}):(?<seconds>\d{2}:(?<milli>\.?\d{0,3})))[""']\s[a-zA-Z]\w+[=][""'](?<date2>(?<year2>(?:\d{4}|\d{2})[\/\-](?<month2>\d{1,2})[\/\-](?<day2>\d{1,2})))[""']\s[a-zA-Z]\w+[=][""'](?<time2>(?<hour2>\d{2}):(?<minutes2>\d{2}):(?<seconds2>\d{2}:(?<milli2>\.?\d{0,3})))[""'][>](?<logEntry>.*))";
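
To see the rule in isolation, here is a cut-down sketch. The shortened pattern and the variable names are mine, for illustration only; it is not the full expression from the question. It writes the same pattern once as a verbatim literal and once as a regular literal, then checks that both produce the same string.

```csharp
using System;
using System.Text.RegularExpressions;

// Shortened, illustrative pattern (not the full one from the question).
// Verbatim literal: backslashes are literal, a quote is written as "".
string verbatim = @"Date=[""'](?<date>\d{4}/\d{1,2}/\d{1,2})[""']";

// Regular literal: the same pattern needs \" for quotes and \\ for backslashes.
string regular = "Date=[\"'](?<date>\\d{4}/\\d{1,2}/\\d{1,2})[\"']";

string input = "<Msg Date=\"2015/04/29\" Time=\"12:13:39:187\">";

Match m = Regex.Match(input, verbatim);

Console.WriteLine(verbatim == regular);      // both literals yield the same pattern text
Console.WriteLine(m.Groups["date"].Value);   // the captured date attribute
```

The verbatim form is usually easier to read for regexes because only the quotes need attention.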

Upvotes: 5

DrKoch

Reputation: 9772

In a C# string literal that starts with @ there is just one special character: ". If you need this character, you have to escape it with another ".

So your regex should look like this:

    string strRegXPattern = @"([<][a-zA-Z]\w+\s[a-zA-Z]\w+[=][""'](?...

Note the double double quotes.

BUT

What you are trying to read is an XML string. You should use an XML library to read it. Do not reinvent the wheel.
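
A minimal sketch of what that could look like with System.Xml.Linq, assuming every log line is a well-formed <Msg> element like the sample (the variable names are mine):

```csharp
using System;
using System.Xml.Linq;

string line = "<Msg Date=\"2015/04/29\" Time=\"12:13:39:187\" DateReceived=\"2015/04/29\" TimeReceived=\"12:13:39:187\"><Layer Name=\"MC\"><SourceLayer Name=\"GUI\" /><Message Name=\"OperatorLogin\" Id=\"1\" Status=\"Successful\" /></Layer></Msg>";

XElement msg = XElement.Parse(line);

// The (string) cast returns null instead of throwing when an attribute is missing.
string date = (string)msg.Attribute("Date");
string time = (string)msg.Attribute("Time");

// Element() returns null when a child is missing, hence the ?. operators.
XElement message = msg.Element("Layer")?.Element("Message");
string name = (string)message?.Attribute("Name");
string status = (string)message?.Attribute("Status");

Console.WriteLine($"{date} {time} {name} {status}");
```

XElement.Parse throws an XmlException on malformed input, so wrap it in a try/catch if the log can contain broken lines.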

Upvotes: 1

musefan

Reputation: 48415

If you are using a verbatim string, i.e. @"" then you need to escape quotes by doubling them up...

So: " becomes ""

    string strRegXPattern = @"([<][a-zA-Z]\w+\s[a-zA-Z]\w+[=][""'](?<date>(?<year>(?:\d{4}|\d{2})[\/\-](?<month>\d{1,2})[\/\-](?<day>\d{1,2})))[""']\s[a-zA-Z]\w+[=][""'](?<time>(?<hour>\d{2}):(?<minutes>\d{2}):(?<seconds>\d{2}:(?<milli>\.?\d{0,3})))[""']\s[a-zA-Z]\w+[=][""'](?<date2>(?<year2>(?:\d{4}|\d{2})[\/\-](?<month2>\d{1,2})[\/\-](?<day2>\d{1,2})))[""']\s[a-zA-Z]\w+[=][""'](?<time2>(?<hour2>\d{2}):(?<minutes2>\d{2}):(?<seconds2>\d{2}:(?<milli2>\.?\d{0,3})))[""'][>](?<logEntry>.*))";

Visual Studio should make it obvious where the offending quote characters are: the syntax highlighting changes at the point where it thinks the string has ended.

Upvotes: 2
