Reputation: 1
I am looking for a Java-compatible regular expression that matches only anchor tags whose href value is not the same as the link text.
e.g. 1 (should not be matched):
<a href="http://www.google.co.in">http://www.google.co.in</a>
e.g. 2 (should be matched):
<a href="http://www.google.co.in">Google</a>
I have tried the following, but it is not working as intended:
<a(.*?)(?i)href\\s*=\\s*"([^"\\s]+)"(.*?)>(?=\\2)(.+?)</a>
Upvotes: 0
Views: 480
Reputation: 36304
Well, if you really want to do this with a regex, you have to capture the value of href first and then check whether it appears again later in the tag:
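public class Main { // class name is arbitrary
    public static void main(String[] args) {
        // Group 1 captures the href value; \1 requires that same value to reappear later.
        String pattern = "<a href=\"(.*?)\".*\\1.*";
        String s = "<a href=\"http://www.google.co.in\">http://www.google.co.in</a>";
        System.out.println(s.matches(pattern)); // link text repeats the href
        String s1 = "<a href=\"http://www.google.co.in\">http://www.google12.co.in</a>";
        System.out.println(s1.matches(pattern)); // link text differs from the href
    }
}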
Output:
true
false
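Note that the pattern above returns true when the link text repeats the href, which is the opposite of what the question asks for. You can negate that result, or move the check into a negative lookahead so that only the mismatching anchors are matched. The following is a minimal sketch of the lookahead variant, not a definitive solution. The class name AnchorTextMismatch and the method findMismatched are made up for illustration. It assumes double-quoted href values and link text with no surrounding whitespace or nested tags. For arbitrary HTML, a real HTML parser is likely to be more reliable than any regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption, not from the original post.
public class AnchorTextMismatch {

    // Group 1 captures the href value, group 2 the link text.
    // (?!\1</a>) rejects anchors whose whole link text is exactly the href value.
    static final Pattern ANCHOR = Pattern.compile(
        "<a\\b[^>]*?\\bhref\\s*=\\s*\"([^\"]*)\"[^>]*>(?!\\1</a>)(.*?)</a>",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns every anchor tag in html whose link text differs from its href.
    static List<String> findMismatched(String html) {
        List<String> result = new ArrayList<>();
        Matcher m = ANCHOR.matcher(html);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<a href=\"http://www.google.co.in\">http://www.google.co.in</a>"
                    + "<a href=\"http://www.google.co.in\">Google</a>";
        // Only the second anchor (link text "Google") should be reported.
        System.out.println(findMismatched(html));
    }
}
```

Unlike the attempt in the question, which uses the positive lookahead (?=\\2), this uses a negative lookahead and anchors it with </a>, so link text that merely starts with the href value still counts as different.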
Upvotes: 1