Cratylus

Reputation: 54074

Java: What is wrong with this regex?

I am trying to get the text inside a tag, e.g. the text in <text>. I am doing:

Pattern pattern = Pattern.compile("(?<=\\<).*(?=\\>)");

I think this says: any character, zero or more times, preceded by < (positive lookbehind) and followed by > (positive lookahead).

Matcher m = pattern.matcher(data);
if (!m.matches()) continue; // called in a for loop

But there is no match for, e.g., the input <text> some other stuff here.

What am I doing wrong here?

Upvotes: 0

Views: 67

Answers (4)

Jaco Van Niekerk

Reputation: 4182

I don't quite understand your regular expression, but this works for me:

String text = "<text>";
Pattern p = Pattern.compile(".*<(.*)>.*");
Matcher m = p.matcher(text);
System.out.println(m.matches());
System.out.println(m.group(1));

This displays:

true
text

Is that what you need?

Upvotes: 0

Thomas

Reputation: 88707

Use m.find() instead of m.matches().

From the JavaDoc on matches():

Attempts to match the entire region against the pattern.

Upvotes: 5

waxwing

Reputation: 18743

When you are using matches(), the entire input string must match the expression. If you want to find substrings, you may use find() instead.
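
A minimal sketch of that difference, reusing the pattern and sample input from the question. The class name MatchesVsFind and the two helper methods are made up for illustration; only Pattern, Matcher, matches(), find() and group() come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchesVsFind {
    // Pattern from the question: text preceded by '<' and followed by '>'.
    static final Pattern TAG_BODY = Pattern.compile("(?<=\\<).*(?=\\>)");

    // matches() succeeds only if the pattern covers the whole input.
    static boolean wholeInputMatches(String data) {
        return TAG_BODY.matcher(data).matches();
    }

    // find() scans for the first matching substring; returns null if there is none.
    static String firstMatch(String data) {
        Matcher m = TAG_BODY.matcher(data);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String data = "<text> some other stuff here";
        System.out.println(wholeInputMatches(data)); // false
        System.out.println(firstMatch(data));        // text
    }
}
```

With matches(), the lookbehind has to hold at the very start of the input, where nothing precedes it, so the call can never succeed for this pattern. Note that .* is greedy: if the input contains more than one >, find() will stretch the match to the last one.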

Upvotes: 5

Eugene

Reputation: 120848

You can try this to match:

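import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagNames {
    public static void main(String[] args) {
        String input = "<text> Some Value </text> <a>  <testTag>";
        // Capture everything between '<' and the nearest '>', as long as it starts with a word character.
        Pattern p = Pattern.compile("<(\\w.*?)>");
        Matcher m = p.matcher(input);

        // Each call to find() advances to the next match.
        while (m.find()) {
            System.out.println(m.group(1));
        }
    }
}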

Upvotes: 1
