user2408588

Extract xml from string using regular expression

I have the following log file from a server, and I want to extract the XML documents from it.

2:00:11 PM >>Response: <?xml version="1.0" encoding="UTF-8"?>

<HotelML xmlns="http://www.xpegs.com/v2001Q3/HotelML"><Head><Route Destination="TR" Source="00"><Operation Action="Create" App="UltraDirect-d1c1_" AppVer="V1_1" DataPath="/HotelML" StartTime="2013-07-31T08:33:13.223+00:00" Success="true" TotalProcessTime="711"/></Route>............

</HotelML>


3:00:11 PM >>Response: <?xml version="1.0" encoding="UTF-8"?>

<HotelML xmlns="http://www.xpegs.com/v2001Q3/HotelML"><Head><Route Destination="TR" Source="00"><Operation Action="Create" App="UltraDirect-d1c1_" AppVer="V1_1" DataPath="/HotelML" StartTime="2013-07-31T08:33:13.223+00:00" Success="true" TotalProcessTime="711"/></Route>............

</HotelML>

5:00:11 PM >>Response: <?xml version="1.0" encoding="UTF-8"?>

<HotelML xmlns="http://www.xpegs.com/v2001Q3/HotelML"><Head><Route Destination="TR" Source="00"><Operation Action="Create" App="UltraDirect-d1c1_" AppVer="V1_1" DataPath="/HotelML" StartTime="2013-07-31T08:33:13.223+00:00" Success="true" TotalProcessTime="711"/></Route>............

</HotelML>

I have written the following regular expression for this, but it matches only the first entry in the string. I want to return all the XML strings as a collection.

(?<= Response:).*>.*</.*?>

Upvotes: 2

Views: 2181

Answers (2)

Alex Filipovici

Reputation: 32541

Here's another approach which should leave you with a List<XDocument>:

using System.IO;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

class Program
{
    static void Main(string[] args)
    {

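        // Read the whole log; each entry starts with a marker such as "2:00:11 PM >>Response: ".
        var input = File.ReadAllText("text.txt");
        var xmlDocuments = Regex
            .Matches(input, @"([0-9AMP: ]*>>Response: )")
            .Cast<Match>()
            .Select(match =>
                {
                    // The XML begins right after the marker and runs up to the
                    // next marker, or to the end of the input for the last entry.
                    var currentPosition = match.Index + match.Length;
                    var nextMatch = match.NextMatch();
                    if (nextMatch.Success)
                    {
                        return input.Substring(currentPosition,
                            nextMatch.Index - currentPosition);
                    }
                    else
                    {
                        return input.Substring(currentPosition);
                    }
                })
            // XDocument.Parse throws an XmlException if a slice is not well-formed XML.
            .Select(s => XDocument.Parse(s))
            .ToList();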
    }
}

Upvotes: 1

Daren Thomas

Reputation: 70314

Why aren't you matching from <HotelML to </HotelML>?

Something like:

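<HotelML .*?</HotelML>

(The lazy .*? makes each match stop at the first closing tag instead of running to the last one in the file. In .NET you also need RegexOptions.Singleline so that . matches across line breaks.)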
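A minimal C# sketch of this idea, assuming the whole log fits in memory and every entry's root element is HotelML. The class and method names (HotelMlRegexExtractor, Extract) are made up for illustration. The pattern uses a lazy quantifier and RegexOptions.Singleline so that each match covers exactly one document, even when it spans several lines:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

// Hypothetical helper, not part of any library.
public static class HotelMlRegexExtractor
{
    // Lazy ".*?" stops at the first closing tag;
    // Singleline lets "." match line breaks inside a document.
    private static readonly Regex HotelMlBlock =
        new Regex(@"<HotelML\b.*?</HotelML>", RegexOptions.Singleline);

    // Returns one XDocument per <HotelML>...</HotelML> block in the log text.
    // XDocument.Parse throws an XmlException if a block is not well-formed.
    public static List<XDocument> Extract(string logText)
    {
        return HotelMlBlock
            .Matches(logText)
            .Cast<Match>()
            .Select(m => XDocument.Parse(m.Value))
            .ToList();
    }
}
```

It could be called as HotelMlRegexExtractor.Extract(File.ReadAllText("text.txt")). The XML declaration before each block is skipped, which XDocument.Parse accepts.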

Or, just go through the file line by line, and whenever you find a line matching

^.* PM >>Response:.*$

read the following lines as XML until the next matching line.
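The line-by-line variant might look roughly like the sketch below. The names ResponseLogReader and ReadXmlBlocks are hypothetical, and two assumptions go beyond the pattern above: AM timestamps are accepted as well as PM, and whatever follows "Response:" on the marker line (the XML declaration in the sample log) is kept as the start of the block.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class ResponseLogReader
{
    // Marker line such as "2:00:11 PM >>Response: <?xml ...?>".
    // Group 1 captures the rest of the line after the marker.
    private static readonly Regex Marker =
        new Regex(@"^.* [AP]M >>Response:(.*)$");

    // Returns the raw XML text of each entry, in file order.
    public static List<string> ReadXmlBlocks(TextReader reader)
    {
        var blocks = new List<string>();
        StringBuilder current = null;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            var match = Marker.Match(line);
            if (match.Success)
            {
                // A new marker closes the previous block, if there is one.
                if (current != null) blocks.Add(current.ToString().Trim());
                current = new StringBuilder(match.Groups[1].Value).AppendLine();
            }
            else if (current != null)
            {
                // Lines before the first marker are ignored.
                current.AppendLine(line);
            }
        }
        // Flush the last block, which has no following marker.
        if (current != null) blocks.Add(current.ToString().Trim());
        return blocks;
    }
}
```

Passing File.OpenText("text.txt") as the reader streams the file instead of loading it all at once, and each returned string can then be handed to XDocument.Parse.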

Upvotes: 2
