user2320462

Reputation: 269

regex not matching wildcard

I have the following HTML:

<tr><td><font color="#306eff">P: </font>9283-1000<font color="#306eff">&nbsp;&nbsp;

or, with a newline after the number:

<tr><td><font color="#306eff">P: </font>9283-1000

<font color="#306eff">&nbsp;&nbsp;

I went to regexpal.com and entered the following regex:

P: </font>(.*?)<font

And it matches. But when I do it in Java, it doesn't match:

    Pattern rP = Pattern.compile(">P: </font>(.*?)<font");
    Matcher mP = rP.matcher(data);

    if (mP.find()) {
        System.out.println(mP.group(1).trim());
    }

There are multiple regexes I tried on different occasions and they simply don't work in Java. Any suggestions? Thanks!

Upvotes: 0

Views: 757

Answers (4)

aalku

Reputation: 2878

Dot does not match newline by default.

Use Pattern rP = Pattern.compile(">P: </font>(.*?)<font", Pattern.DOTALL);

Reference: the Javadoc for java.util.regex.Pattern, under the DOTALL flag.
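As a minimal sketch of this suggestion, the snippet below runs the question's pattern with Pattern.DOTALL against both input shapes from the question. The class name DotAllDemo, the helper extract, and the two sample strings are made up for illustration; only the pattern and the flag come from this thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original question's code.
class DotAllDemo {
    // Same pattern as in the question. DOTALL makes '.' match line terminators too.
    private static final Pattern P =
            Pattern.compile(">P: </font>(.*?)<font", Pattern.DOTALL);

    // Returns the trimmed text between "P: </font>" and the next "<font",
    // or null when the pattern is not found.
    static String extract(String data) {
        Matcher m = P.matcher(data);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        String oneLine = "<tr><td><font color=\"#306eff\">P: </font>9283-1000"
                + "<font color=\"#306eff\">&nbsp;&nbsp;";
        String twoLines = "<tr><td><font color=\"#306eff\">P: </font>9283-1000\n"
                + "<font color=\"#306eff\">&nbsp;&nbsp;";

        System.out.println(extract(oneLine));  // 9283-1000
        System.out.println(extract(twoLines)); // 9283-1000
    }
}
```

The trim() call matters in the second case, because with DOTALL the captured group includes the trailing newline.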

Upvotes: 1

Stephan

Reputation: 43073

Try this regex instead:

(?ims).*?>P: </font>(.*?)<font.+

Sample code

public static void main(String[] args) {
    String data="<tr><td><font color=\"#306eff\">P: </font>9283-1000<font color=\"#306eff\">&nbsp;&nbsp;";
    Pattern rP = Pattern.compile("(?ims).*?>P: </font>(.*?)<font.+");
    Matcher mP = rP.matcher(data);

    if (mP.find()) {
        System.out.println(mP.group(1).trim());
    }
}

Output

9283-1000

Upvotes: 0

Sujith PS

Reputation: 4864

Try this:

String data="<tr><td><font color=\"#306eff\">P: </font>9283-1000<font color=\"#306eff\">&nbsp;&nbsp;";
Pattern rP = Pattern.compile(">P: </font>(.*?)<font");
Matcher mP = rP.matcher(data);

if (mP.find()) {
    System.out.println(mP.group(1).trim());
}

In Java, the only difference is the escape character: double quotes inside the string literal must be written as \".

Upvotes: 0

peter.petrov

Reputation: 39477

Your code works fine for me.

    public static void main(String[] args) {
        String data = "<tr><td><font color=\"#306eff\">P: </font>9283-1000<font color=\"#306eff\">&nbsp;&nbsp;";
        Pattern rP = Pattern.compile(">P: </font>(.*?)<font");
        Matcher mP = rP.matcher(data);

        if (mP.find()) {
            System.out.println(mP.group(1).trim());
        }
    }

This prints: 9283-1000

The code itself is OK, as you can see from this output, so I guess the problem may be in how data is fed into the program.
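One way to check that guess is to look at what data actually contains. The sketch below is only an illustration: the class InspectData and its helpers are hypothetical, and the two-line sample assumes the newline variant shown in the question. It runs the unmodified pattern against both variants and prints the input with its line terminators made visible.

```java
import java.util.regex.Pattern;

// Hypothetical diagnostic class; the names are invented for this example.
class InspectData {
    // The question's pattern, compiled without any flags.
    static final Pattern ORIGINAL = Pattern.compile(">P: </font>(.*?)<font");

    // True when the unmodified pattern finds a match in data.
    static boolean matches(String data) {
        return ORIGINAL.matcher(data).find();
    }

    // Replaces CR and LF with the visible escapes \r and \n for printing.
    static String visible(String data) {
        return data.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        String oneLine = "<font color=\"#306eff\">P: </font>9283-1000<font";
        String twoLines = "<font color=\"#306eff\">P: </font>9283-1000\n<font";

        System.out.println(matches(oneLine));  // true
        System.out.println(matches(twoLines)); // false: '.' stops at the newline
        System.out.println(visible(twoLines)); // shows a literal \n before <font
    }
}
```

If the printed data shows \n or \r between the number and the next tag, the input itself explains the failed match rather than the pattern.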

Upvotes: 2
