Reputation: 151
For practice, I'm writing a regex pattern that scans HTML tags and prints only the contents of valid tags. The pattern itself appears to match tags correctly, but I'm running into an issue when printing the contents.
import java.io.*;
import java.util.*;
import java.text.*;
import java.math.*;
import java.util.regex.*;

public class HTMLPattern {
    public static void main(String[] args) {
        Scanner in = new Scanner(System.in);
        int testCases = Integer.parseInt(in.nextLine());
        while (testCases > 0) {
            String line = in.nextLine();
            String tagPattern = "<([^>]+)>([^<]*?)</\\1>";
            Pattern p = Pattern.compile(tagPattern, Pattern.MULTILINE);
            Matcher m = p.matcher(line);
            if (m.find()) {
                // checks if the output equals a newline
                if (m.group(2).matches("[\\n\\r]+")) {
                    System.out.println("None");
                } else {
                    System.out.println(m.group(2));
                }
            } else {
                System.out.println("None");
            }
            testCases--;
        }
    }
}
When inputting:
3
<a>test</a>
<b></b>
<c>test</c>
My output should be:
test
None
test
But instead it is (the second line is blank):
test

test
My question is: Why is my if-statement not catching the newline and printing "None"?
Upvotes: 0
Views: 127
Reputation: 151
It turns out there is no newline in the captured group. My earlier check, if(m.group(2) == null), failed because the group is an empty string rather than null. The .isEmpty() method correctly detects the empty string I was testing for:
if (m.find()) {
    if (m.group(2).isEmpty()) {
        System.out.println("None");
    } else {
        System.out.println(m.group(2));
    }
} else {
    System.out.println("None");
}
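To illustrate why the null check never fired, here is a minimal sketch. The class name EmptyVsNullGroup, the helper describe, and the second pattern "(a)?b" are made up for this example. As far as I understand it, Matcher.group(n) returns an empty string when the group took part in the match but captured nothing, and returns null only when the group did not take part at all, such as an optional group that was skipped.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmptyVsNullGroup {
    // Hypothetical helper: reports what a capturing group holds after find().
    static String describe(String regex, String input, int group) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return "no match";
        }
        String captured = m.group(group);
        if (captured == null) {
            return "null";   // the group did not participate in the match
        }
        return captured.isEmpty() ? "empty" : captured;
    }

    public static void main(String[] args) {
        String tagPattern = "<([^>]+)>([^<]*?)</\\1>";
        // Group 2 participates but captures nothing between the tags.
        System.out.println(describe(tagPattern, "<b></b>", 2));     // prints "empty"
        System.out.println(describe(tagPattern, "<a>test</a>", 2)); // prints "test"
        // An optional group that is skipped yields null instead.
        System.out.println(describe("(a)?b", "b", 1));              // prints "null"
    }
}
```

With the pattern from the question, group 2 always participates whenever find() succeeds, so it should never be null there; only the empty-string check is needed.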
Upvotes: 0
Reputation: 2570
There is no newline, just an empty string. Try matching the empty string like this:
if (m.group(2).matches("^$")) {
Or check the length of the string:
if (m.group(2).length() == 0) {
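As a rough sketch of how these checks compare, the snippet below runs the regex check, the length check, and String.isEmpty() on an empty and a non-empty string. The class name EmptyChecks and its method names are invented for illustration only.

```java
class EmptyChecks {
    // Regex check: "^$" matches only when the whole string is empty.
    static boolean byRegex(String s) {
        return s.matches("^$");
    }

    // Length check: an empty string has zero characters.
    static boolean byLength(String s) {
        return s.length() == 0;
    }

    // Built-in check: equivalent to length() == 0.
    static boolean byIsEmpty(String s) {
        return s.isEmpty();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"", "test"}) {
            System.out.println("\"" + s + "\": regex=" + byRegex(s)
                    + " length=" + byLength(s)
                    + " isEmpty=" + byIsEmpty(s));
        }
    }
}
```

All three agree on these inputs. The length and isEmpty() checks avoid compiling a regex on every call, so they are probably the simpler choice here.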
Upvotes: 2