Larsi

Reputation: 4774

Regex to parse text between '

I've got the following text:

  1. Send Request to BizTalk. CaseID: '2011000264', Title: 'ArchiveDocument Poup - fields.docx', Date: '11.01.2013 13:15:28'
  2. Send Request to BizTalk. Title: 'Jallafields.docx', Date: '11.01.2013 13:15:28'

Now I would like to parse out the Title. I know this should be pretty straightforward, but I'm struggling, so any help would be very welcome.

Upvotes: 1

Views: 225

Answers (4)

sloth

Reputation: 101142

Just for some Regex/LINQ fun:

var s = "Send Request to BizTalk. CaseID: '2011000264', Title: 'ArchiveDocument Poup - fields.docx', Date: '11.01.2013 13:15:28'";
var d = Regex.Matches(s, @"(?<=[\W])(\w*):\W'([^']*)'")
             .OfType<Match>()
             .ToDictionary(m => m.Groups[1].Value, m => m.Groups[2].Value);

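d is now a dictionary with one entry per field:

CaseID => 2011000264
Title => ArchiveDocument Poup - fields.docx
Date => 11.01.2013 13:15:28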

J̶u̶s̶t̶ ̶h̶o̶p̶e̶ ̶t̶h̶e̶r̶e̶'̶s̶ ̶n̶o̶ ̶̶'̶̶ ̶i̶n̶ ̶t̶h̶e̶ ̶t̶i̶t̶l̶e̶,̶ ̶t̶h̶o̶u̶g̶h̶.̶.̶.̶

To handle embedded single quotes, just replace the '([^']*)' part with '([^']+(?:\\'[^']*)*)', as fge suggests in his great answer.

Upvotes: 3

fge

Reputation: 121810

Match your text against:

\bTitle: '([^']+)'

and capture the first group.
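In C#, that could look roughly like the sketch below. The class and method names (TitleParser, ExtractTitle) are made up for illustration, and returning null when no Title field is found is my assumption, not something the question asks for.

```csharp
using System.Text.RegularExpressions;

public static class TitleParser
{
    // Hypothetical helper: returns the text between the quotes that follow
    // "Title: ", or null when the line has no Title field.
    public static string ExtractTitle(string line)
    {
        Match m = Regex.Match(line, @"\bTitle: '([^']+)'");
        // Groups[1] is the first capturing group, i.e. the part inside the quotes.
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

Calling TitleParser.ExtractTitle on the second line from the question gives Jallafields.docx, and on the first line it gives ArchiveDocument Poup - fields.docx.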

This, of course, supposes that there are no embedded single quotes... If there are, use the normal* (special normal*)* "regex pattern" like so (this example assumes such embedded quotes are escaped with a backslash):

\bTitle: '([^\\']+(?:\\'[^\\']*)*)'

Here, normal is [^\\'] (anything but a backslash or a single quote) and special is \\' (a backslash followed by a single quote). And this is the kind of thing which the often-used (overused?) lazy quantifiers cannot do ;)
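A minimal C# sketch of the escaped-quote variant, assuming (as above) that embedded quotes are written as \' in the input. The names EscapedTitleParser and ExtractTitle are hypothetical, and the final Replace that turns \' back into ' is an extra step I added, not part of the pattern itself.

```csharp
using System.Text.RegularExpressions;

public static class EscapedTitleParser
{
    // Hypothetical helper: extracts the Title value, allowing \' inside it.
    public static string ExtractTitle(string line)
    {
        // normal  = [^\\']  (anything but a backslash or a single quote)
        // special = \\'     (a backslash followed by a single quote)
        Match m = Regex.Match(line, @"\bTitle: '([^\\']+(?:\\'[^\\']*)*)'");
        if (!m.Success) return null;

        // Assumption: the caller wants the unescaped text, so \' becomes '.
        return m.Groups[1].Value.Replace(@"\'", "'");
    }
}
```

For an input such as Title: 'Bob\'s file.docx', this returns Bob's file.docx, where the simpler pattern would stop at Bob\.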

Upvotes: 4

Oded

Reputation: 499212

Regex is overkill for this.

Use string.Split instead:

myString.Split('\'')[3]

To break it down a bit: myString.Split('\'') splits the string on the passed-in character (' in this case) and returns an array of results. The array subscript [3] retrieves the fourth value in that array, which is the title.

The above assumes a very strict structure to the string.


With the second example you posted, it is clear that the above approach will not work.
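To make the failure concrete, here is a small sketch that wraps the Split call in a helper (SplitDemo and FourthPiece are names invented for this illustration) so it can be run against both lines from the question.

```csharp
public static class SplitDemo
{
    // Hypothetical helper: returns the fourth piece after splitting on ',
    // exactly as myString.Split('\'')[3] does.
    public static string FourthPiece(string line)
    {
        return line.Split('\'')[3];
    }
}
```

For the first line this returns ArchiveDocument Poup - fields.docx, but for the second line, which has no CaseID, the fourth piece is the date, 11.01.2013 13:15:28, because the title has moved to index 1.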

Upvotes: 1

Pranay Rana

Reputation: 176946

Parsing the string like this will work for you:

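string s = " Send Request to BizTalk. CaseID: '2011000264', Title: 'ArchiveDocument Poup - fields.docx', Date: '11.01.2013 13:15:28'";

// Split into comma-separated fields (this breaks if a title contains a comma).
string[] all = s.Split(',');

foreach (string str in all)
{
    if (str.Contains("Title:"))
    {
        // Take the part after the colon, then strip the space and the quotes.
        Console.WriteLine(str.Split(':')[1].Trim().Trim('\''));
    }
}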

Upvotes: 0
