Reputation: 3134
I was trying to match the following example:
<p><a href="example/index.html">LinkToPage</a></p>
With rubular.com I could get something like <a href=\"(.*)?\/index.html\">.*<\/a>.
I'll be using this in Pattern.compile in Java. I know that \ has to be escaped as well, and I've come up with <a href=\\\"(.*)?\\\/index.html\\\">.*<\\\/a> and a few more variations, but I'm getting it wrong. I tested on regexplanet. Can anyone help me with this?
Upvotes: 0
Views: 235
Reputation: 39354
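You can tell Java what to match literally and call Pattern.quote(str) to have it escape the correct things for you. Note that Pattern.quote treats its whole argument as literal text, so regex parts such as (.*) have to stay outside the quoted pieces.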
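A minimal sketch of how that could look for the link in the question, assuming the literal pieces are quoted separately and the (.*) and .* parts are left as regex. The class and method names (QuoteDemo, buildPattern, extractDir) are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteDemo {
    // Quote only the literal pieces; the capture group and the wildcard
    // stay outside Pattern.quote so they keep their regex meaning.
    static Pattern buildPattern() {
        String regex = Pattern.quote("<a href=\"")
                + "(.*)"
                + Pattern.quote("/index.html\">")
                + ".*"
                + Pattern.quote("</a>");
        return Pattern.compile(regex);
    }

    // Returns the part of the href before /index.html, or null if no link matches.
    static String extractDir(String html) {
        Matcher m = buildPattern().matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String html = "<p><a href=\"example/index.html\">LinkToPage</a></p>";
        System.out.println(extractDir(html)); // prints "example"
    }
}
```

A side effect of quoting is that the dot in index.html is also matched literally, which the hand-written pattern does not do.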
Upvotes: 0
Reputation: 5547
Pattern.compile("<a href=\"(.*)?/index.html\">.*</a>");
That should fix your regex. You do not need to escape the forward slashes.
However, I am obligated to present you with the standard caution against parsing HTML with regex:
RegEx match open tags except XHTML self-contained tags
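For reference, here is a small sketch that runs the pattern from this answer against the sample line in the question. The class and method names (LinkMatchDemo, firstDir) are only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LinkMatchDemo {
    // Same regex as in the answer: only the double quotes are escaped,
    // and only because the regex sits inside a Java string literal.
    static final Pattern LINK =
            Pattern.compile("<a href=\"(.*)?/index.html\">.*</a>");

    // Returns the captured directory part of the href, or null if nothing matches.
    static String firstDir(String html) {
        Matcher m = LINK.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String html = "<p><a href=\"example/index.html\">LinkToPage</a></p>";
        System.out.println(firstDir(html)); // prints "example"
    }
}
```

Note that the unescaped . in index.html matches any character. Write it as \\. in the Java string if it should match only a literal dot.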
Upvotes: 1
Reputation: 143354
Use "<a href=\"(.*)/index.html\">.*</a>" in your Java code.
You only need to escape " because it's a Java string literal.
You don't need to escape /, because you aren't delimiting your regex with slashes (as you would be in Ruby).
Also, (.*)? makes no sense. Just use (.*). * can already match "nothing", so there's no point in having the ?.
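As a quick check of the point about the ?, the sketch below compiles both variants and applies them to the sample line from the question. The class and method names (OptionalGroupDemo, capture) are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OptionalGroupDemo {
    // Variant with the redundant ? after the group, as in the question.
    static final Pattern WITH_QUESTION_MARK =
            Pattern.compile("<a href=\"(.*)?/index.html\">.*</a>");
    // Simplified variant recommended in this answer.
    static final Pattern WITHOUT_QUESTION_MARK =
            Pattern.compile("<a href=\"(.*)/index.html\">.*</a>");

    // Returns group 1 of the first match, or null if the pattern does not match.
    static String capture(Pattern p, String html) {
        Matcher m = p.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String html = "<p><a href=\"example/index.html\">LinkToPage</a></p>";
        System.out.println(capture(WITH_QUESTION_MARK, html));    // prints "example"
        System.out.println(capture(WITHOUT_QUESTION_MARK, html)); // prints "example"
    }
}
```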
Upvotes: 2