Reputation: 595
I need help with a regular expression, as I don't know regular expressions well.
I have this regular expression:
Regex myregex = new Regex("testValue=\"(.+?)\"");
What does (.+?) indicate?
The string it matches is testValue="123e4567", and it returns 123e4567 as output.
Now I need a regular expression that matches the string "<helpMe>123e4567</helpMe>" and gives me 123e4567 as output. How do I write a regular expression for it?
Upvotes: 1
Views: 157
Reputation: 23943
This means:
( Begin captured group
. Match any character
+ One or more times
? Non-greedy quantifier
) End captured group
In the case of your regex, the non-greedy quantifier ? means that your captured group will begin after the first double-quote, and then end immediately before the very next double-quote it encounters. If it were greedy (without the ?), the group would extend to the very last double-quote it encounters on that line (i.e., "greedily" consuming as much of the line as possible).
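To make the difference concrete, here is a minimal sketch (C# 9 or later, top-level statements). The input line with a second quoted attribute is a made-up example, not something from the question:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input: two quoted values on the same line.
string input = "testValue=\"123e4567\" other=\"abc\"";

// Lazy: the group stops before the very next double-quote.
string lazy = Regex.Match(input, "testValue=\"(.+?)\"").Groups[1].Value;

// Greedy: the group runs up to the last double-quote on the line.
string greedy = Regex.Match(input, "testValue=\"(.+)\"").Groups[1].Value;

Console.WriteLine(lazy);   // 123e4567
Console.WriteLine(greedy); // 123e4567" other="abc
```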
For your "helpMe" example, you'd want this regex:
<helpMe>(.+?)</helpMe>
Given this string:
<div>Something<helpMe>ABCDE</helpMe></div>
You'd get this match:
ABCDE
The value of the non-greedy quantifier is evident in this variation:
Regex: <helpMe>(.+)</helpMe>
String: <div>Something<helpMe>ABCDE</helpMe><helpMe>FGHIJ</helpMe></div>
The greedy capture would look like this:
ABCDE</helpMe><helpMe>FGHIJ
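As a rough sketch of how the non-greedy pattern could be used from C# (C# 9 or later, top-level statements; the variable names are only illustrative), Regex.Match returns the first hit and Regex.Matches returns every tag's content separately:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

string pattern = "<helpMe>(.+?)</helpMe>";

// The string from the question: group 1 holds the text between the tags.
string single = Regex.Match("<helpMe>123e4567</helpMe>", pattern).Groups[1].Value;
Console.WriteLine(single); // 123e4567

// With two tags on one line, the lazy pattern yields one match per tag.
string input = "<div>Something<helpMe>ABCDE</helpMe><helpMe>FGHIJ</helpMe></div>";
var values = new List<string>();
foreach (Match m in Regex.Matches(input, pattern))
{
    values.Add(m.Groups[1].Value);
}
Console.WriteLine(string.Join(",", values)); // ABCDE,FGHIJ
```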
Interactive regex testers are a useful way to experiment with these variations.
Upvotes: 4
Reputation: 101614
Ken Redler has a great answer regarding your first question. For the second question try:
<(helpMe)>(.*?)</\1>
Using the back reference \1 you can find values between the set of matching tags. The first group finds the tag name, the second group matches the content itself, and the \1 back reference re-uses the first group's match (in this case the tag name).
Also, in C# you can use named groups, like: <(helpMe)>(?<value>.*?)</\1>
where match.Groups["value"].Value now contains your value.
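A small sketch of the named-group version in use (C# 9 or later, top-level statements). The second pattern, which swaps the literal tag name for \w+, is my own generalization to show what the back reference buys you; it is an assumption, not part of the answer above:

```csharp
using System;
using System.Text.RegularExpressions;

string input = "<helpMe>123e4567</helpMe>";
Match match = Regex.Match(input, @"<(helpMe)>(?<value>.*?)</\1>");

string tag = match.Groups[1].Value;         // unnamed group 1: the tag name
string value = match.Groups["value"].Value; // named group: the content

Console.WriteLine(tag);   // helpMe
Console.WriteLine(value); // 123e4567

// Assumed generalization: any tag name, closing tag must repeat it via \1.
string anyTag = @"<(\w+)>(?<value>.*?)</\1>";
bool mismatched = Regex.IsMatch("<helpMe>123e4567</other>", anyTag);
Console.WriteLine(mismatched); // False
```

In .NET, unnamed groups are numbered before named ones, so \1 still refers to (helpMe) even though a named group is present.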
Upvotes: 2
Reputation: 59012
What does (.+?) indicate?
It means match any character (.) one or more times, but as few times as possible (+?).
A simple regex to match your second string would be
<helpMe>([a-z0-9]+)<\/helpMe>
This will match one or more lowercase letters a-z or digits between <helpMe> and </helpMe>.
The parentheses are used to capture a group. This is useful if you need to reference the value inside this group later.
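A quick sketch of what the character class accepts and rejects (C# 9 or later, top-level statements). The pattern is written without the backslash before the slash, since .NET does not require escaping it, and the RegexOptions.IgnoreCase variant is my own addition for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

string pattern = "<helpMe>([a-z0-9]+)</helpMe>";

// Lowercase letters and digits are inside the class, so this matches.
Match lower = Regex.Match("<helpMe>123e4567</helpMe>", pattern);
Console.WriteLine(lower.Groups[1].Value); // 123e4567

// Uppercase letters are not in [a-z0-9], so this does not match.
bool upper = Regex.IsMatch("<helpMe>ABCDE</helpMe>", pattern);
Console.WriteLine(upper); // False

// Assumed variant: IgnoreCase makes the whole pattern case-insensitive.
bool upperIgnoreCase = Regex.IsMatch("<helpMe>ABCDE</helpMe>", pattern, RegexOptions.IgnoreCase);
Console.WriteLine(upperIgnoreCase); // True
```

Compared with (.+?), the explicit class is stricter: it refuses spaces, punctuation, and nested tags, which may or may not be what you want.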
Upvotes: 0