petko_stankoski

Reputation: 10713

Regex not working in C#

Here is my regex:

href\\s*=\\s*(?:\"(?<1>[^\"]*)\"|(?<1>\\S+))

And here is the input string I'm matching against:

"<p>dfhdfh</p>\r\n<p><a href=\"/Content/blabla/345/344\">najnov</a></p>\r\n<p>&nbsp;</p>\r\n<p><a href=\"/Content/blabla/345/323:test 1\">test 1&nbsp;</a></p>"

But m.Groups contains only these two entries:

{href="/Content/blabla/345/344"}
{/Content/blabla/345/344}

How can I get the second href as well?

Here is my code:

Match m = Regex.Match(myString, "href\\s*=\\s*(?:\"(?<1>[^\"]*)\"|(?<1>\\S+))", RegexOptions.IgnoreCase);
if (m.Success)
{
    for (int ij = 0; ij < m.Groups.Count; ij++)
        myString = myString.Replace(m.Groups[ij].Value.Substring(7), m.Groups[ij].Value.Substring(m.Groups[ij].Value.LastIndexOf("/") + 1));
}

Upvotes: 1

Views: 1048

Answers (3)

Alan Moore

Reputation: 75222

I'm going to assume the original string is this:

<p>dfhdfh</p>
<p><a href="/Content/blabla/345/344">najnov</a></p>
<p>&nbsp;</p>
<p><a href="/Content/blabla/345/323:test 1">test 1&nbsp;</a></p>

...and that what you posted is the string literal you would use to create that string. Getting all the href attributes out of it is as simple as this:

Regex r = new Regex(@"href\s*=\s*(?:""(?<HREF>[^""]*)""|(?<HREF>\S+))");

foreach (Match m in r.Matches(htmlString))
{
  Console.WriteLine(m.Groups["HREF"].Value);
}

I changed the name of the capturing group to HREF to make it clear that we're retrieving the group by its name, not by its number.

As you can see, you're doing a whole lot of work you don't need to do.
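If the goal of the loop in the question was to shorten each href to the part after its last slash (that is my assumption, the question doesn't say so explicitly), the overload of Regex.Replace that takes a MatchEvaluator can do it in one pass. This is only a sketch: the class name HrefShortener and the method ShortenHrefs are made up for illustration, and the replacement always writes the value back in double quotes.

```csharp
using System;
using System.Text.RegularExpressions;

class HrefShortener
{
    // Same pattern as above, with the named group HREF.
    const string Pattern = @"href\s*=\s*(?:""(?<HREF>[^""]*)""|(?<HREF>\S+))";

    // Hypothetical helper: rewrites every href to its last path segment.
    public static string ShortenHrefs(string html)
    {
        return Regex.Replace(html, Pattern, m =>
        {
            string href = m.Groups["HREF"].Value;
            // LastIndexOf returns -1 when there is no slash, so Substring(0)
            // keeps the whole value in that case.
            string lastSegment = href.Substring(href.LastIndexOf('/') + 1);
            return "href=\"" + lastSegment + "\"";
        }, RegexOptions.IgnoreCase);
    }

    static void Main()
    {
        string html = "<p><a href=\"/Content/blabla/345/344\">najnov</a></p>";
        Console.WriteLine(ShortenHrefs(html));
    }
}
```

Because the evaluator runs once per match, there is no need to call string.Replace on the whole input for every group.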

Upvotes: 0

stema

Reputation: 92976

Apart from the HTML/regex stuff, to get all results at once, use Matches. That method returns a MatchCollection that contains every Match object found.

See "The MatchCollection and Match Objects" on MSDN.
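As a rough illustration of what the MatchCollection gives you, here is a minimal sketch. The class name MatchCollectionDemo and the helper FindHrefs are invented for the example, and the input is a shortened version of the string from the question.

```csharp
using System;
using System.Text.RegularExpressions;

class MatchCollectionDemo
{
    // Same pattern as in the question, written as a verbatim string.
    const string Pattern = @"href\s*=\s*(?:""(?<1>[^""]*)""|(?<1>\S+))";

    // Hypothetical helper: returns every href match in the input.
    public static MatchCollection FindHrefs(string html)
    {
        return Regex.Matches(html, Pattern, RegexOptions.IgnoreCase);
    }

    static void Main()
    {
        string html = "<p><a href=\"/Content/blabla/345/344\">najnov</a></p>\r\n"
                    + "<p><a href=\"/Content/blabla/345/323:test 1\">test 1&nbsp;</a></p>";

        MatchCollection matches = FindHrefs(html);
        Console.WriteLine(matches.Count); // one entry per href in the input

        // The collection can be indexed; each Match knows where it was found.
        for (int i = 0; i < matches.Count; i++)
        {
            Match m = matches[i];
            Console.WriteLine("{0} at index {1}: {2}", i, m.Index, m.Groups[1].Value);
        }
    }
}
```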

Upvotes: 1

GShenanigan

Reputation: 5493

I tested this using RAD software RegEx designer.

This regex returns multiple matches, with one group within each match. So you shouldn't be trying to get your results from the groups of a single match (the group named "1"). Instead, you should iterate over the collection of matches and retrieve the value of each match (or of the group within each).

This is the result that gets output:

[Screenshot: output from RAD RegEx designer]

So you should be calling Regex.Matches in your code, not Regex.Match, and iterating through the results.
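To make the difference concrete, here is a small sketch that contrasts the two calls. The class name MatchVersusMatches and the shortened input are my own. It shows why m.Groups in the question held only two entries: group 0 is the whole first match, and group 1 is the value captured inside it.

```csharp
using System;
using System.Text.RegularExpressions;

class MatchVersusMatches
{
    // Same pattern as in the question, written as a verbatim string.
    public const string Pattern = @"href\s*=\s*(?:""(?<1>[^""]*)""|(?<1>\S+))";

    static void Main()
    {
        string html = "<a href=\"/Content/blabla/345/344\">najnov</a> "
                    + "<a href=\"/Content/blabla/345/323:test 1\">test 1</a>";

        // Regex.Match stops at the first href. Its Groups collection holds
        // group 0 (the whole match) and group 1 (the captured value).
        Match first = Regex.Match(html, Pattern, RegexOptions.IgnoreCase);
        Console.WriteLine(first.Groups.Count);    // 2
        Console.WriteLine(first.Groups[0].Value); // href="/Content/blabla/345/344"
        Console.WriteLine(first.Groups[1].Value); // /Content/blabla/345/344

        // Regex.Matches returns one Match per href, each with its own groups.
        foreach (Match m in Regex.Matches(html, Pattern, RegexOptions.IgnoreCase))
        {
            Console.WriteLine(m.Groups[1].Value);
        }
    }
}
```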

Upvotes: 1
