Reputation: 313
I've got a string that looks like this:
[...] <a rel=\"nofollow\" class=\"username offline popupctrl\" href=\"http://....html\" title=\"T3XTT0F1ND is offline\" id=\"...\">\">\">\">"[...]
If I set the pattern to
"<a rel=\"nofollow\" (.+) id=\"(.+)(?=\")"
I get T3XTT0F1ND">">"> instead of just T3XTT0F1ND in Groups[2].Value. How can I write the regex so that it matches not only the first possible occurrence of 'a rel="nofollow"...' but also the first possible occurrence of 'id="'?
Upvotes: 0
Views: 1237
Reputation: 34385
This works for A tags where the ID attribute always follows the REL attribute. The ID value is captured into capture group 1:
Regex regexObj = new Regex(
@"<a\b # Open start tag delimiter
[^>]*? # Everything up to REL attrib
\b rel=""nofollow"" # REL attrib.
[^>]*? # Everything up to ID attrib
\b id=""([^""]*)"" # $1: ID attrib.
[^>]* # Everything up to end of start tag.
> # Close start tag delimiter",
RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace);
Match matchResult = regexObj.Match(subjectString);
while (matchResult.Success) {
    resultList.Add(matchResult.Groups[1].Value);
    matchResult = matchResult.NextMatch();
}
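To see why the negated character class helps, here is a minimal, self-contained sketch that runs the question's greedy pattern and a compact form of the pattern above against the same input. The sample string is an assumption: the `href` and `id` values (`http://example.com/u.html`, `user42`) and the trailing `<span>` are invented for the demo, since the question elides the real ones.

```csharp
using System;
using System.Text.RegularExpressions;

class GreedyVsNegatedClass
{
    static void Main()
    {
        // Hypothetical input: the href, the id value and the trailing <span> are made up.
        string subject =
            "<a rel=\"nofollow\" class=\"username offline popupctrl\" " +
            "href=\"http://example.com/u.html\" title=\"T3XTT0F1ND is offline\" " +
            "id=\"user42\">T3XTT0F1ND</a> <span class=\"x\">more</span>";

        // Greedy (.+) runs to the end of the input and backtracks only
        // as far as the LAST double quote, so it swallows everything in between.
        Match greedy = Regex.Match(subject, "<a rel=\"nofollow\" (.+) id=\"(.+)(?=\")");
        Console.WriteLine(greedy.Groups[2].Value);
        // prints: user42">T3XTT0F1ND</a> <span class="x

        // [^"]* cannot cross a double quote, so the capture ends at the
        // first closing quote after id=".
        Match bounded = Regex.Match(
            subject,
            "<a\\b[^>]*?\\brel=\"nofollow\"[^>]*?\\bid=\"([^\"]*)\"",
            RegexOptions.IgnoreCase);
        Console.WriteLine(bounded.Groups[1].Value);
        // prints: user42
    }
}
```

The `[^>]*?` parts also keep the whole match inside a single start tag, which the `.+` version does not guarantee.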
Upvotes: 0
Reputation: 7103
Shouldn't you add one more capture group, (), for the title, like this:
<a rel=\"nofollow\" (.+) title=\"(.+)\" id=\"(.+)(?=\")
With this, Groups[2] would return T3XTT0F1ND is offline.
Also, do you mean that your id equals T3XTT0F1ND and the group captures more than that? If so, you could try the regex below:
<a rel=\"nofollow\" (.+) id=\"(.+)[^>]\"
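Another option, not taken from the pattern above but a commonly used alternative, is to make the quantifiers lazy with `.+?` so that each group stops at the earliest point where the rest of the pattern can match. A minimal sketch follows, again assuming a made-up sample string (the `href` and `id` values are hypothetical):

```csharp
using System;
using System.Text.RegularExpressions;

class LazyQuantifierDemo
{
    static void Main()
    {
        // Hypothetical input: the href, the id value and the trailing <span> are made up.
        string subject =
            "<a rel=\"nofollow\" class=\"username offline popupctrl\" " +
            "href=\"http://example.com/u.html\" title=\"T3XTT0F1ND is offline\" " +
            "id=\"user42\">T3XTT0F1ND</a> <span class=\"x\">more</span>";

        // (.+?) expands one character at a time, so group 2 ends just before
        // the first double quote that follows id=".
        Match m = Regex.Match(subject, "<a rel=\"nofollow\" (.+?) id=\"(.+?)(?=\")");
        Console.WriteLine(m.Groups[2].Value);
        // prints: user42
    }
}
```

Note that `(.+?)` still requires at least one character, so an empty `id=""` would be mishandled; a negated class such as `([^"]*)` avoids that case.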
Upvotes: 1