What regex could I use to extract a body of XML text from a body of unformatted text?

Question

Let's say I have the following body of text:

Call me Ishmael. Some years ago- never mind how long precisely- having little 
or no money in my purse, and nothing particular to interest me on shore, I 
thought I would sail about a little and see the watery part of the world. It is  


   
   

a way I have of driving off the spleen and regulating the circulation. Whenever  
I find myself growing grim about the mouth; whenever it is a damp, drizzly 
November in my soul;

What regex could I use to return the XML embedded in the string?

NOTE: I can assume that and will always have the same name.

SLaks · Accepted Answer

If you know that the root element will always be and that there will never be a nested tag, you can do it like this:

\<\?xml .+?\

This regex will lazily match all text between and .
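As a rough illustration, here is a minimal Python sketch of the same lazy-match idea. The root element name `root`, the helper `extract_xml`, and the sample text are all hypothetical, since the real names are not shown above. It also assumes the XML may span several lines, so it uses `re.DOTALL` to let `.` match newlines (most regex flavors need an equivalent single-line option for this).

```python
import re

def extract_xml(text, root="root"):
    """Return the first XML document embedded in text, or None.

    Assumes the document starts with an XML declaration and ends at the
    first closing tag of the (hypothetical) root element, which is never nested.
    """
    pattern = re.compile(
        r"<\?xml\s.+?</" + re.escape(root) + r"\s*>",
        re.DOTALL,  # let '.' cross line breaks inside the XML
    )
    match = pattern.search(text)
    return match.group(0) if match else None

# Made-up sample: prose, then an XML document, then more prose.
sample = (
    "Call me Ishmael. Some years ago...\n"
    '<?xml version="1.0"?>\n'
    "<root>\n"
    "  <item>1</item>\n"
    "</root>\n"
    "a way I have of driving off the spleen"
)
xml = extract_xml(sample)
```

Because the quantifier is lazy, the match stops at the first closing root tag, which is why the approach breaks if the root element can be nested inside itself.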
