Pan

Reputation: 6645

Regular expression for the following

I have a tagged file that can contain records like the following:

<test> <code> abcd </code> </test>
<test> efgh </test> 

How do I extract one test tag at a time? That is, I want to extract the test tag in both of the cases above, whether it contains only text or other nested tags as well.

Upvotes: 1

Views: 83

Answers (2)

Tim Pietzcker

Reputation: 336108

Try:

Pattern regex = Pattern.compile("<test>(.*?)</test>", Pattern.DOTALL);

This would fail, though, if <test> tags themselves can be nested (<test> ... <test>...</test> ... </test>).

The ? makes the preceding * quantifier lazy, i.e., it matches as few characters as possible, so each match covers only one tag at a time.
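As a minimal sketch of how the pattern might be used to pull out one tag per match, the loop below calls Matcher.find() repeatedly and collects group 1. The class name TestTagExtractor, the method extract, and the trimming of surrounding whitespace are illustrative assumptions, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TestTagExtractor {
    // Lazy pattern from the answer; DOTALL lets '.' match line breaks too.
    private static final Pattern TEST_TAG =
            Pattern.compile("<test>(.*?)</test>", Pattern.DOTALL);

    // Returns the content of each <test> element, in order of appearance.
    // Trimming is an illustrative choice, not something the pattern requires.
    static List<String> extract(String input) {
        List<String> contents = new ArrayList<>();
        Matcher m = TEST_TAG.matcher(input);
        while (m.find()) {
            contents.add(m.group(1).trim());
        }
        return contents;
    }

    public static void main(String[] args) {
        // The two sample records from the question.
        String input = "<test> <code> abcd </code> </test>\n<test> efgh </test>";
        for (String content : extract(input)) {
            System.out.println(content);
        }
    }
}
```

With the two records from the question, this yields one entry per tag. On nested input such as  c </test>, the match ends at the first </test>, so the outer tag is cut short, which is the failure mode mentioned above.
<test>
assert TestTagExtractor.extract("<test> <code> abcd </code> </test>\n<test> efgh </test>").equals(java.util.List.of("<code> abcd </code>", "efgh"));
assert TestTagExtractor.extract("<test> a <test> b </test> c </test>").equals(java.util.List.of("a <test> b"));
assert TestTagExtractor.extract("no tags here").isEmpty();
</test>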

Upvotes: 1

Mark Byers

Reputation: 838076

Try the regular expression:

"\\bstart-tag:test\\s+(.*?)\\s+end-tag:test\\b"

The important point is that the ? here makes the match non-greedy; otherwise a single match can capture multiple tags.
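To see the difference, here is a rough sketch that runs the expression above next to a greedy variant (the same pattern without the ?). The input string, the class name GreedyVsLazy, and the helper captures are assumptions made up for illustration; the input assumes records delimited by start-tag:test and end-tag:test on a single line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsLazy {
    // Collects group 1 of every match of the given regex in the input.
    static List<String> captures(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical single-line input holding two records.
        String input =
                "start-tag:test abcd end-tag:test start-tag:test efgh end-tag:test";
        String lazy = "\\bstart-tag:test\\s+(.*?)\\s+end-tag:test\\b";
        String greedy = "\\bstart-tag:test\\s+(.*)\\s+end-tag:test\\b";

        System.out.println(captures(lazy, input));   // one capture per record
        System.out.println(captures(greedy, input)); // a single capture spanning both
    }
}
```

The lazy form stops at the first end-tag:test it can reach, while the greedy form runs on to the last one and swallows everything in between.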

Upvotes: 0
