jlupi

Reputation: 35

Java Regex - exclude empty tags from xml

Let's say I have three XML strings:

String logToSearch = "<abc><number>123456789012</number></abc>";

String logToSearch2 = "<abc><number xsi:type=\"soapenc:string\" /></abc>";

String logToSearch3 = "<abc><number /></abc>";

I need a pattern that finds the number tag only if the tag contains a value, i.e. a match should be found only in logToSearch.

I'm not looking for the number itself; I just need matcher.find() to return true only for the first string.

For now I have this: Pattern pattern = Pattern.compile("<(" + pattrenString + ").*?>", Pattern.CASE_INSENSITIVE); where pattrenString is simply "number". I tried changing it to "<(" + pattrenString + ")[^/>].*?>", but it didn't work because [^/>] is a character class that matches a single character that is neither / nor >, not the sequence "/>".

Thanks

Upvotes: 1

Views: 1615

Answers (2)

Matt K

Reputation: 13872

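A search for "<number[^/>]*>" finds the opening tag only when it is not self-closing. If you want to be sure the element isn't empty, try "<number[^/>]*>[^<]" (any content directly after the tag) or "<number[^/>]*>[0-9]" (content that starts with a digit).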
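
As a rough sketch of how the first of those patterns could be wired up against the three strings from the question: the class and method names here are hypothetical, and the `\b` after the tag name is an addition not in the pattern above, included so that a tag such as <numbers> is not picked up.

```java
import java.util.regex.Pattern;

class NonEmptyTagRegex {
    // Hypothetical helper: true if xml contains an opening <tagName ...> tag that is
    // not self-closing and is immediately followed by a character other than '<'.
    static boolean hasNonEmptyTag(String xml, String tagName) {
        Pattern pattern = Pattern.compile(
                // \b is an added assumption: it stops "number" from matching "<numbers>".
                "<" + Pattern.quote(tagName) + "\\b[^/>]*>[^<]",
                Pattern.CASE_INSENSITIVE);
        return pattern.matcher(xml).find();
    }

    public static void main(String[] args) {
        String logToSearch = "<abc><number>123456789012</number></abc>";
        String logToSearch2 = "<abc><number xsi:type=\"soapenc:string\" /></abc>";
        String logToSearch3 = "<abc><number /></abc>";

        System.out.println(hasNonEmptyTag(logToSearch, "number"));
        System.out.println(hasNonEmptyTag(logToSearch2, "number"));
        System.out.println(hasNonEmptyTag(logToSearch3, "number"));
    }
}
```

Two caveats: [^/>]* also rejects an opening tag whose attribute value contains a slash (a URL, for example), and [^<] accepts whitespace-only content, so this only holds up for simple, predictable input.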

Upvotes: 0

Stefan Kendall

Reputation: 67892

This is absolutely the wrong way to parse XML. In fact, if you need more than just the basic example given here, there's provably no way to solve the more complex cases with regex.

Use an easy XML parser, like XOM. Then, using XPath, query for the elements and filter out those without data. I can only imagine that this question is a precursor to future headaches unless you modify your approach right now.
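
To give an idea of what the parser-plus-XPath route might look like, here is a minimal sketch. It uses the DOM parser and XPath support that ship with the JDK instead of XOM, purely so it runs without extra dependencies; the class and method names are hypothetical, and the XPath expression is one possible way to express "a number element with non-blank text".

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class NonEmptyTagXPath {
    // Hypothetical helper: true if the document has a <number> element with non-blank text.
    static boolean hasNonEmptyNumber(String xml) {
        try {
            // The factory is not namespace-aware by default, so the undeclared
            // xsi: prefix in the question's second string is read as a plain attribute name.
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Object result = XPathFactory.newInstance().newXPath().evaluate(
                    "boolean(//number[normalize-space(.) != ''])",
                    doc, XPathConstants.BOOLEAN);
            return (Boolean) result;
        } catch (Exception e) {
            // Malformed XML or an XPath failure is reported as an unchecked exception.
            throw new IllegalArgumentException("Could not evaluate XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hasNonEmptyNumber("<abc><number>123456789012</number></abc>"));
        System.out.println(hasNonEmptyNumber("<abc><number xsi:type=\"soapenc:string\" /></abc>"));
        System.out.println(hasNonEmptyNumber("<abc><number /></abc>"));
    }
}
```

Unlike the regex in the question, XPath element names are case-sensitive, so <Number> would need its own expression. Whitespace-only content counts as empty here because of normalize-space.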

Upvotes: 1
