Reputation: 1
Possible duplicate: RegEx matching HTML tags and extracting text
I need to get the text between HTML tags such as <p></p>, or any other tag. My current pattern is:
Pattern pText = Pattern.compile(">([^>|^<]*?)<");
Does anyone know a better pattern? This one is not very useful. I need it to extract the content of a web page for indexing.
Thanks
Upvotes: 0
Views: 2561
Reputation: 700222
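It looks like you are trying to use the | operator inside a negated character class, where it neither works nor is needed: inside [...], | is just a literal character, and so is a ^ that is not the first character. Just list the characters that you don't want to match: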
Pattern pText = Pattern.compile(">([^<>]*?)<");
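For illustration, here is a minimal sketch of how that pattern could be used to collect the text pieces. The class name TagTextExtractor and the method extractText are made up for this example, and trimming and skipping empty matches is an assumption about what you want for indexing:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class TagTextExtractor {
    // Text between a '>' and the next '<', containing neither character.
    private static final Pattern P_TEXT = Pattern.compile(">([^<>]*?)<");

    static List<String> extractText(String html) {
        List<String> pieces = new ArrayList<>();
        Matcher m = P_TEXT.matcher(html);
        while (m.find()) {
            // Assumption: whitespace-only runs between tags are not wanted.
            String text = m.group(1).trim();
            if (!text.isEmpty()) {
                pieces.add(text);
            }
        }
        return pieces;
    }

    public static void main(String[] args) {
        // prints [Hello, World]
        System.out.println(extractText("<p>Hello</p><p>World</p>"));
        // The original class [^>|^<] also excluded '|' and '^', so it would
        // find nothing here; the corrected pattern prints [a|b]
        System.out.println(extractText("<p>a|b</p>"));
    }
}
```

This only copes with simple markup: a > inside an attribute value, or the contents of a script element, would confuse it.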
Upvotes: 3