Reputation: 62228
I am trying to create a regular expression to find incorrect uses of resources in my HTML file. For this specific test I want to return strings that match @@[a-zA-Z0-9\._\-]* but do not start with i18n=". This is what I have so far:
/(?!i18n=")@{2}[a-zA-Z0-9\._\-]+/gi
I can't seem to get the negation to work though. These are my test strings but my expression finds a match on the first line which is not what I am expecting.
This one is correct and should not be returned i18n="@@DashboardResources.ProgramContentTitle"
Bad, should be returned @@DashboardResources.ProgramContentTitle
@@DashboardResources.ProgramContentTitle is Bad and should be returned.
Bad, should be returned @@DashboardResources.ProgramContentTitle"
Bad, should be returned i8@@DashboardResources.ProgramContentTitle
I am using regex101 to test. Any pointers would be most appreciated.
Should it matter, the end use will be in a C# application using the System.Text.RegularExpressions.Regex class.
Upvotes: 1
Views: 45
Reputation: 626926
First of all, if you want to test a .NET regex, do that at RegexStorm.net or RegexHero.net, since regex101 does not support the .NET regex syntax. Regex101 is certainly great in that it explains the pattern automatically, but the Ultrapico Expresso tool will do the same for you.
Next, the (?!i18n=") construct is a negative lookahead: it matches a position that is not immediately followed by the i18n=" substring. Since the very next character you require is @, that check always succeeds and never excludes anything. What you need is a negative lookbehind, (?<!i18n="), which matches a location that is not immediately preceded by the i18n=" substring.
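// Requires: using System.Linq; using System.Text.RegularExpressions;
// s is the input string to scan (for example, the contents of the HTML file).
var matches = Regex.Matches(s, @"(?<!i18n="")@@[a-zA-Z0-9._-]+")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToList();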
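To see the difference on the test strings from the question, here is a minimal, self-contained sketch. The `ResourceChecker` class and its `FindUnwrapped` method are hypothetical names introduced only for this illustration; the pattern is the lookbehind version shown above, and the lookahead pattern is the question's own (without the case-insensitive flag).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class ResourceChecker
{
    // Negative lookbehind: the position before "@@" must NOT be preceded by i18n="
    private static readonly Regex Unwrapped =
        new Regex(@"(?<!i18n="")@@[a-zA-Z0-9._-]+");

    // Returns every @@Resource.Key in the input that is not directly preceded by i18n="
    public static List<string> FindUnwrapped(string input) =>
        Unwrapped.Matches(input).Cast<Match>().Select(m => m.Value).ToList();
}

public static class Program
{
    public static void Main()
    {
        // Test strings copied from the question.
        string[] lines =
        {
            "This one is correct and should not be returned i18n=\"@@DashboardResources.ProgramContentTitle\"",
            "Bad, should be returned @@DashboardResources.ProgramContentTitle",
            "@@DashboardResources.ProgramContentTitle is Bad and should be returned.",
            "Bad, should be returned @@DashboardResources.ProgramContentTitle\"",
            "Bad, should be returned i8@@DashboardResources.ProgramContentTitle",
        };

        // The original lookahead pattern, kept for comparison.
        var lookahead = new Regex(@"(?!i18n="")@@[a-zA-Z0-9._-]+");

        foreach (var line in lines)
        {
            bool lookaheadHit = lookahead.IsMatch(line);
            int lookbehindHits = ResourceChecker.FindUnwrapped(line).Count;
            Console.WriteLine($"lookahead: {lookaheadHit}, lookbehind matches: {lookbehindHits} | {line}");
        }
    }
}
```

With these inputs the lookahead pattern matches every line, including the first, while the lookbehind pattern skips the first line and finds one match in each of the other four.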
If you share more details on what the document looks like and what the requirements are, there might be a different, better solution here. For a more robust way to analyze HTML, you may use HtmlAgilityPack or a similar library.
Upvotes: 1