Blankman

Reputation: 267390

Java regex, need help with escape characters

My HTML looks like:

<td class="price" valign="top"><font color= "blue">&nbsp;&nbsp;$&nbsp;      5.93&nbsp;</font></td>

I tried:

String result = "";
Pattern p = Pattern.compile("\"blue\">&nbsp;&nbsp;$&nbsp;(.*)&nbsp;</font></td>");
Matcher m = p.matcher(text);

if (m.find())
    result = m.group(1).trim();

Doesn't seem to be matching.

Am I missing an escape character?

Upvotes: 1

Views: 646

Answers (2)

Stephen C

Reputation: 719739

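Unless it is escaped at the regex level, $ is an anchor that matches at the end of the input (or at the end of a line in MULTILINE mode), not a literal dollar sign. The regex needs a single \ in front of the $ to escape it, and that \ must itself be escaped in the Java String literal, i.e. written as two \ characters. So ...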

... Pattern.compile("\"blue\">&nbsp;&nbsp;\\$&nbsp;(.*)&nbsp;</font></td>");
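As a rough sketch of that fix applied to the HTML from the question (the class name PriceExtractor and the method extractPrice are made up for illustration; only the pattern comes from the line above):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class, used only to make the snippet runnable.
class PriceExtractor {
    // "\\$" in the Java literal becomes \$ in the regex: a literal dollar sign.
    private static final Pattern PRICE = Pattern.compile(
            "\"blue\">&nbsp;&nbsp;\\$&nbsp;(.*)&nbsp;</font></td>");

    // Returns the trimmed price text, or "" when the pattern does not match.
    static String extractPrice(String text) {
        Matcher m = PRICE.matcher(text);
        return m.find() ? m.group(1).trim() : "";
    }

    public static void main(String[] args) {
        // The sample cell from the question, including the odd space in color= "blue".
        String text = "<td class=\"price\" valign=\"top\"><font color= \"blue\">"
                + "&nbsp;&nbsp;$&nbsp;      5.93&nbsp;</font></td>";
        System.out.println(extractPrice(text)); // prints 5.93
    }
}
```

The group captures the run of spaces in front of the number as well, which is why the trim() call from the question is still needed.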

But the folks who commented that you shouldn't use regexes to parse HTML are absolutely right! Unless you want chronically fragile code, you should use a strict or non-strict HTML parser instead.

Upvotes: 2

ZyX

Reputation: 53674

Maybe you need to escape the $ (with two backslashes, I think)?
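A small sketch to check that guess, comparing the unescaped $ with a few ways of making it literal (the class DollarEscapes and the helper finds are hypothetical names; the shortened input string is an assumption chosen to keep the example readable):

```java
import java.util.regex.Pattern;

// Hypothetical helper class for trying out the different spellings.
class DollarEscapes {
    // True if the regex matches anywhere in the text.
    static boolean finds(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "$&nbsp;5.93";
        // Unescaped: $ is an end anchor, so nothing can follow it.
        System.out.println(finds("$&nbsp;", text));                // false
        // Two backslashes in the source give one backslash in the regex.
        System.out.println(finds("\\$&nbsp;", text));              // true
        // Inside a character class, $ is an ordinary character.
        System.out.println(finds("[$]&nbsp;", text));              // true
        // Pattern.quote turns the whole string into a literal.
        System.out.println(finds(Pattern.quote("$&nbsp;"), text)); // true
    }
}
```

Pattern.quote is convenient when the literal part comes from somewhere else and may contain other metacharacters.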

Upvotes: 1
