Saurabh Prajapati

Reputation: 332

How to replace xml empty tags using regex

I have a lot of empty XML tags that need to be removed from a string.

 String someData = dealDataWriter.toString();
 someData = someData.replaceAll("<somerandomField1/>", "");
 someData = someData.replaceAll("<somerandomField2/>", "");
 someData = someData.replaceAll("<somerandomField3/>", "");
 someData = someData.replaceAll("<somerandomField4/>", "");

This performs a lot of string operations, which is not efficient. What would be a better way to avoid these operations?

Upvotes: 0

Views: 3716

Answers (4)

Mubashar

Reputation: 12668

If you want to remove both <tagA></tagA> and <tagB/>, you can use the following regex. Note that \1 is a backreference to the first capturing group, so the closing tag name must match the opening one.

// Identifies an empty tag, i.e. <tag1></tag1> or <tag1/>.
// It also allows whitespace around or within the tag; however, tags whose value is whitespace will not match.
private static final String EMPTY_VALUED_TAG_REGEX = "\\s*<\\s*(\\w+)\\s*></\\s*\\1\\s*>|\\s*<\\s*\\w+\\s*/\\s*>";

Run the code on ideone
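
For reference, a minimal sketch of how the regex above could be applied in one pass. The class name EmptyTagRegexDemo, the method stripEmptyTags, and the sample XML are made up for illustration; only the pattern itself comes from the answer.

```java
import java.util.regex.Pattern;

class EmptyTagRegexDemo {
    // Regex from the answer, compiled once so it can be reused across calls.
    private static final Pattern EMPTY_VALUED_TAG = Pattern.compile(
            "\\s*<\\s*(\\w+)\\s*></\\s*\\1\\s*>|\\s*<\\s*\\w+\\s*/\\s*>");

    // Removes <tag></tag> and <tag/> occurrences in a single pass over the input.
    static String stripEmptyTags(String xml) {
        return EMPTY_VALUED_TAG.matcher(xml).replaceAll("");
    }

    public static void main(String[] args) {
        String xml = "<deal><id>7</id><note></note><flag/></deal>"; // hypothetical sample
        System.out.println(stripEmptyTags(xml)); // prints <deal><id>7</id></deal>
    }
}
```

Because the replacement is a single pass, a parent that only becomes empty after its children are removed is left behind: <a><b/></a> turns into <a></a>. Repeating the call until the string stops changing would be one way to handle that.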

Upvotes: 0

Oneiros

Reputation: 4378

I would not recommend using regex on HTML/XML, but for a simple case like yours it may be fine to use a rule like this one:

someData = someData.replaceAll("<\\w+?\\/>", "");

Test: link

If you also want to allow optional spaces before and after the tag name:

someData = someData.replaceAll("<\\s*\\w+?\\s*\\/>", "");

Test: link
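
A small sketch of what the second pattern does and does not match, with the pattern precompiled so that the four separate replaceAll calls from the question collapse into one pass. The class name SelfClosingTagDemo, the method strip, and the sample inputs are hypothetical.

```java
import java.util.regex.Pattern;

class SelfClosingTagDemo {
    // Pattern from the answer: a self-closing tag with optional spaces around the name.
    private static final Pattern SELF_CLOSING = Pattern.compile("<\\s*\\w+?\\s*\\/>");

    static String strip(String xml) {
        return SELF_CLOSING.matcher(xml).replaceAll("");
    }

    public static void main(String[] args) {
        // Plain self-closing tags are removed, with or without a trailing space.
        System.out.println(strip("<a><somerandomField1/><somerandomField2 /><b>1</b></a>"));
        // prints <a><b>1</b></a>

        // \w does not cover ':' and the pattern has no room for attributes,
        // so prefixed names and tags with attributes are left untouched.
        System.out.println(strip("<a><ns:empty/><x id=\"1\"/></a>"));
        // prints <a><ns:empty/><x id="1"/></a>
    }
}
```

Leaving tags with attributes alone is probably what you want here, since such a tag still carries data even though it has no body.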

Upvotes: 1

Kallmanation

Reputation: 1182

As an alternative to regex or string matching, you can use an XML parser to find empty tags and remove them.

See the answers given over here: Java Remove empty XML tags
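
A rough sketch of the parser approach using the JDK's built-in DOM and Transformer APIs. This is not taken from the linked question; the class name EmptyElementRemover and its methods are made up for illustration, and "empty" is assumed here to mean an element with no child nodes and no attributes.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class EmptyElementRemover {

    // Depth-first removal: children are cleaned first, so a parent that
    // becomes empty after its children are removed is removed as well.
    static void removeEmptyElements(Node node) {
        NodeList children = node.getChildNodes();
        // Walk backwards because removing a child shrinks the live NodeList.
        for (int i = children.getLength() - 1; i >= 0; i--) {
            Node child = children.item(i);
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue;
            }
            removeEmptyElements(child);
            // Assumption: an element with attributes still carries data, so keep it.
            if (!child.hasChildNodes() && !child.hasAttributes()) {
                node.removeChild(child);
            }
        }
    }

    // Parses the string, strips empty elements below the root, and serializes it back.
    static String stripEmptyElements(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            removeEmptyElements(doc.getDocumentElement());

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not process XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<deal><id>7</id><note></note><flag/><group><x/></group></deal>"; // hypothetical sample
        System.out.println(stripEmptyElements(xml));
    }
}
```

The root element is never removed, and an element that contains only whitespace still has a text child, so it is kept. Parsing costs more than a regex pass, but it copes with attributes, namespaces, comments, and CDATA that a regex can trip over.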

Upvotes: 0

Arjun

Reputation: 1

Try the following code. It removes every self-closing tag that has no whitespace in it. Note that the backslash must be doubled inside a Java string literal.

someData = someData.replaceAll("<\\w+/>", "");

Upvotes: 0
