Reputation: 32
So I'm working on a little piece of code that has to run some regexes. In general it's working, except for one regex that returns way too much on the first match but works as intended on every match afterwards.
I built the regex using regex101.com and it looks like this:
>(?<ev>[0-9]{1,3}\s(HP|Atk|Def|SpD|SpA|Spe))<
The code in C# looks like this:
pattern = ">(?<ev>^[0-9]+${1,3}\\s(HP|Atk|Def|SpD|SpA|Spe))<";
foreach (Match match in Regex.Matches(html, pattern))
{
Console.WriteLine(match.Groups["ev"].Captures[0].Value);
}
The HTML I'm matching against is this: http://pokedex.project-sato.net/files/localfile.html
An example that I also tested and that doesn't work: http://pastebin.com/6H0QnFBP
The output in the console is empty.
EDIT:
The output should look like:
4 Def
252 SpA
252 Spe
Upvotes: 0
Views: 67
Reputation: 76577
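Try adjusting your first matching group. It currently contains ^[0-9]+${1,3}, where the + already allows any number of digits, so the {1,3} quantifier no longer limits the digits to 1-3 the way it does in your regex101 version.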
It's likely that you want 1-3 digits followed by one of your properties (e.g. HP, Atk, etc.), which would use the following expression:
// This will match 1-3 digits followed by one of your properties.
// The verbatim string (@"...") passes \d and \s to the regex engine unchanged.
pattern = @">(?<ev>\d{1,3}\s(HP|Atk|Def|SpD|SpA|Spe))<";
This also removes the beginning and end anchors ^ and $. The values you are looking for can appear anywhere in your input, so you don't want to require them to sit at the very start and end of it.
Example
You can see an example of this that uses a snippet of your targeted HTML file as input here.
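For a self-contained version you can paste into a console project, here is a minimal sketch of the same idea. The EvExtractor class, the ExtractEvs helper, and the sample HTML string are made up for illustration (the sample is not taken from your actual page), so adjust them to your real input.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class EvExtractor
{
    // Verbatim string (@"...") so \d and \s reach the regex engine unchanged.
    // Matches 1-3 digits, one whitespace character and a stat name,
    // sitting directly between '>' and '<'.
    private const string Pattern = @">(?<ev>\d{1,3}\s(HP|Atk|Def|SpD|SpA|Spe))<";

    // Hypothetical helper: returns the text of every "ev" group found in the input.
    public static List<string> ExtractEvs(string html)
    {
        var results = new List<string>();
        foreach (Match match in Regex.Matches(html, Pattern))
        {
            results.Add(match.Groups["ev"].Value);
        }
        return results;
    }

    public static void Main()
    {
        // Assumed sample input, only shaped like the kind of markup in question.
        string html = "<span>4 Def</span> / <span>252 SpA</span> / <span>252 Spe</span>";

        foreach (string ev in ExtractEvs(html))
        {
            Console.WriteLine(ev);
        }
    }
}
```

With that sample string, this prints 4 Def, 252 SpA and 252 Spe on separate lines, which is the output the question asks for. Note that match.Groups["ev"].Value is enough here; going through Captures[0] is only needed when a group can capture more than once within a single match.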
Upvotes: 2