Reputation: 6645
I have a tagged file that can contain records like the following:
<test> <code> abcd </code> </test>
<test> efgh </test>
How do I extract one test element at a time? I want to extract the test tag in both of the situations above, whether it contains only text or other nested tags as well.
Upvotes: 1
Views: 83
Reputation: 336108
Try
Pattern regex = Pattern.compile("<test>(.*?)</test>", Pattern.DOTALL);
This would fail, though, if <test> tags themselves can be nested (<test> ... <test>...</test> ... </test>), because the match would end at the first </test>, which closes the inner tag.
The ? makes the preceding * quantifier lazy, i.e. it will match as few characters as possible and therefore only match one tag at a time.
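To pull out each record in turn, the pattern above can be applied in a Matcher.find() loop. The following is a minimal sketch, assuming the whole file has already been read into a String; the class name TestTagExtractor and the helper extractTestTags are hypothetical names chosen for illustration, and trimming the captured content is an assumption about what the caller wants.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TestTagExtractor {
    // Lazy pattern from the answer; DOTALL lets '.' match line breaks too.
    private static final Pattern TEST_TAG =
            Pattern.compile("<test>(.*?)</test>", Pattern.DOTALL);

    // Hypothetical helper: returns the trimmed content of each <test> element, in order.
    public static List<String> extractTestTags(String input) {
        List<String> contents = new ArrayList<>();
        Matcher matcher = TEST_TAG.matcher(input);
        // Each call to find() advances to the next non-overlapping match.
        while (matcher.find()) {
            contents.add(matcher.group(1).trim());
        }
        return contents;
    }

    public static void main(String[] args) {
        // The two sample records from the question.
        String input = "<test> <code> abcd </code> </test>\n<test> efgh </test>";
        for (String content : extractTestTags(input)) {
            System.out.println(content);
        }
    }
}
```

With the two sample records, this should print the content of the first element including its nested code tags, then the plain text of the second. As noted above, it is not suitable for input where test tags nest inside each other.\n<test> efgh </test>");
if (!result.equals(java.util.Arrays.asList("<code> abcd </code>", "efgh"))) throw new AssertionError(result.toString());
if (!TestTagExtractor.extractTestTags("no tags here").isEmpty()) throw new AssertionError("expected no matches");
</test>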
Upvotes: 1
Reputation: 838076
Try the regular expression:
"\\bstart-tag:test\\s+(.*?)\\s+end-tag:test\\b"
The important point is that the ? makes the match non-greedy (lazy); otherwise a single match could span multiple tags.
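The difference between the lazy and greedy forms can be seen by running both against the same input. This is only a sketch: the input line with two start-tag:test / end-tag:test records is an assumed format inferred from the expression above, and the class GreedyVsLazy and helper firstCapture are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsLazy {
    // Hypothetical helper: returns group 1 of the first match, or null if nothing matches.
    public static String firstCapture(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        // Assumed input format: two records on one line.
        String input = "start-tag:test abcd end-tag:test start-tag:test efgh end-tag:test";

        // Expression from the answer (lazy), and the same expression without the '?'.
        String lazy = "\\bstart-tag:test\\s+(.*?)\\s+end-tag:test\\b";
        String greedy = "\\bstart-tag:test\\s+(.*)\\s+end-tag:test\\b";

        // Lazy stops at the first end tag and captures only "abcd".
        System.out.println(firstCapture(lazy, input));
        // Greedy runs to the last end tag, so the capture spans both records.
        System.out.println(firstCapture(greedy, input));
    }
}
```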
Upvotes: 0