Reputation: 269
I have the following HTML:
<tr><td><font color="#306eff">P: </font>9283-1000<font color="#306eff">
or, with a newline before the second font tag:
<tr><td><font color="#306eff">P: </font>9283-1000
<font color="#306eff">
I went to regexpal.com and entered the following regex:
P: </font>(.*?)<font
And it matches. But when I do it in Java, it doesn't match:
Pattern rP = Pattern.compile(">P: </font>(.*?)<font");
Matcher mP = rP.matcher(data);
if (mP.find()) {
System.out.println(mP.group(1).trim());
}
There are multiple regexes I tried on different occasions and they simply don't work in Java. Any suggestions? Thanks!
Upvotes: 0
Views: 757
Reputation: 2878
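In Java, the dot does not match line terminators by default, so the pattern fails when a newline sits between the number and the next <font tag.
Use Pattern rP = Pattern.compile(">P: </font>(.*?)<font", Pattern.DOTALL);
See the Javadoc for Pattern.DOTALL.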
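To make the difference concrete, here is a minimal sketch that runs the pattern from the question against both input variants, with and without Pattern.DOTALL. The class name DotAllDemo and the helper extract are made up for this illustration; the sample strings are the two variants from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DotAllDemo {
    // Hypothetical helper: returns the trimmed text between "P: </font>" and
    // the next "<font", or null when the pattern does not match.
    static String extract(String data, int flags) {
        Pattern p = Pattern.compile(">P: </font>(.*?)<font", flags);
        Matcher m = p.matcher(data);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        // Variant 1: everything on one line.
        String oneLine = "<tr><td><font color=\"#306eff\">P: </font>9283-1000<font color=\"#306eff\">";
        // Variant 2: a newline before the second <font tag.
        String twoLines = "<tr><td><font color=\"#306eff\">P: </font>9283-1000\n<font color=\"#306eff\">";

        System.out.println(extract(oneLine, 0));               // 9283-1000
        System.out.println(extract(twoLines, 0));              // null: '.' stops at the newline
        System.out.println(extract(twoLines, Pattern.DOTALL)); // 9283-1000
    }
}
```

The same flag can also be set inline by starting the pattern with (?s).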
Upvotes: 1
Reputation: 43073
Try this regex instead:
(?ims).*?>P: </font>(.*?)<font.+
Sample code
public static void main(String[] args) {
    String data = "<tr><td><font color=\"#306eff\">P: </font>9283-1000<font color=\"#306eff\"> ";
    Pattern rP = Pattern.compile("(?ims).*?>P: </font>(.*?)<font.+");
    Matcher mP = rP.matcher(data);
    if (mP.find()) {
        System.out.println(mP.group(1).trim());
    }
}
Output
9283-1000
Upvotes: 0
Reputation: 4864
Try this:
String data = "<tr><td><font color=\"#306eff\">P: </font>9283-1000<font color=\"#306eff\"> ";
Pattern rP = Pattern.compile(">P: </font>(.*?)<font");
Matcher mP = rP.matcher(data);
if (mP.find()) {
    System.out.println(mP.group(1).trim());
}
In Java, the only difference is the escape character: double quotes inside a string literal must be written as \".
Upvotes: 0
Reputation: 39477
Your code works fine for me.
public static void main(String[] args) {
    String data = "<tr><td><font color=\"#306eff\">P: </font>9283-1000<font color=\"#306eff\"> ";
    Pattern rP = Pattern.compile(">P: </font>(.*?)<font");
    Matcher mP = rP.matcher(data);
    if (mP.find()) {
        System.out.println(mP.group(1).trim());
    }
}
This prints: 9283-1000.
I guess the problem may be in how data is fed into the program, because the code itself is OK, as you can see from this output.
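One way to check what is actually in data is to print it with the line terminators made visible. This is only a sketch: the class name InspectData and the helper showLineBreaks are hypothetical, and the sample string is an assumed fragment of the real input.

```java
public class InspectData {
    // Hypothetical helper: replaces carriage returns and newlines with
    // visible escape sequences so they show up in console output.
    static String showLineBreaks(String s) {
        return s.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        // Assumed sample: the fragment as it might arrive from a file with Windows line endings.
        String data = "P: </font>9283-1000\r\n<font";
        System.out.println(showLineBreaks(data)); // P: </font>9283-1000\r\n<font
    }
}
```

If a \r or \n shows up between the number and the next <font tag, the input is the multi-line variant from the question rather than the single-line string tested here.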
Upvotes: 2