Reputation: 99
I'm using regular expressions in C# and have the following string:
<tr>
<td class="uk-text-bold">Hello</td>
</tr>
<tr>
<td class="uk-text-bold">World</td>
</tr>
Using this pattern:
<td class=\"uk-text-bold\">(.+?)</td>
I'm trying to get just "Hello" and "World", so everything between brackets, but it keeps returning the full line and I'm stuck.
Can I get some advice?
Thanks in advance.
Upvotes: 0
Views: 86
Reputation: 231
Your expression is OK. If you really need to use regex, I recommend using a named group instead of a numbered one, then iterating through the matches and reading that named group. For example:
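using System;
using System.Linq;
using System.Text.RegularExpressions;

// In a verbatim string, "" is a literal quote; no backslash is needed.
var pattern = @"<td class=""uk-text-bold"">(?<mostwanted>.+?)</td>";
var input = @"<tr>
<td class=""uk-text-bold"">Hello</td>
</tr>
<tr>
<td class=""uk-text-bold"">World</td>
</tr>";
// RegexOptions.Multiline only changes how ^ and $ behave, so it is not needed here.
var regex = new Regex(pattern);
var matches = regex.Matches(input);
// Read the named group from each match instead of the whole match.
foreach (var mostwanted in matches
    .Cast<Match>()
    .Select(m => m.Groups["mostwanted"].Value))
{
    Console.WriteLine(mostwanted);
}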
But, as the others say, a better way is to use an HTML parser (HtmlAgilityPack is very good). If your HTML contains spaces between tags or unexpected line breaks, your regex will stop matching.
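For comparison, here is a rough sketch of the parser route. It assumes the HtmlAgilityPack NuGet package is installed; the `CellReader` class and `ReadBoldCells` method are names made up for this example, and the XPath assumes the `class` attribute is exactly `uk-text-bold`.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // NuGet package "HtmlAgilityPack" (assumed to be installed)

public static class CellReader
{
    // Hypothetical helper: returns the trimmed text of every matching <td>.
    public static List<string> ReadBoldCells(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var texts = new List<string>();
        // SelectNodes returns null (not an empty collection) when nothing matches.
        var cells = doc.DocumentNode.SelectNodes("//td[@class='uk-text-bold']");
        if (cells == null)
        {
            return texts;
        }

        foreach (var cell in cells)
        {
            // Trim so that stray whitespace or line breaks inside the cell are dropped.
            texts.Add(cell.InnerText.Trim());
        }
        return texts;
    }
}
```

Calling `CellReader.ReadBoldCells(input)` on the markup from the question should give the two cell texts, and extra whitespace between or inside the tags should not matter, since the parser works on the element tree rather than on the raw text.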
Upvotes: 1
Reputation: 6538
Your regex is good. To get your value, you must iterate through the match's groups.
To parse HTML, you should use a dedicated library rather than regex. For an introduction to HTML Agility Pack, see: http://www.c-sharpcorner.com/UploadFile/9b86d4/getting-started-with-html-agility-pack/
Upvotes: 1
Reputation: 65116
Once you have a Match m, use m.Groups[1].Value instead of m.Value. Each pair of parentheses defines a new group.
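A minimal sketch of the difference, using the pattern and markup from the question. The `GroupDemo` class and `Extract` method are names made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // Hypothetical helper: returns what group 1 captured in each match.
    public static List<string> Extract(string html)
    {
        var regex = new Regex("<td class=\"uk-text-bold\">(.+?)</td>");
        var captured = new List<string>();
        foreach (Match m in regex.Matches(html))
        {
            // m.Value (the same as m.Groups[0].Value) is the whole match, tags included.
            // m.Groups[1].Value is only what the first pair of parentheses captured.
            captured.Add(m.Groups[1].Value);
        }
        return captured;
    }

    public static void Main()
    {
        var html = "<tr>\n<td class=\"uk-text-bold\">Hello</td>\n</tr>\n" +
                   "<tr>\n<td class=\"uk-text-bold\">World</td>\n</tr>";
        foreach (var text in Extract(html))
        {
            Console.WriteLine(text); // prints Hello, then World
        }
    }
}
```

Swapping `m.Groups[1].Value` for `m.Value` in the loop would give back the full `<td ...>Hello</td>` line, which is the behaviour described in the question.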
But to tell you how you should really do it, use an HTML parsing library for parsing HTML, not regex.
Upvotes: 1