Vikram

Reputation: 7525

Remove XML Tag and Content in XML String using Java Regex

I have an XML string of 400 lines, and it contains the tags below, repeated twice. I want to remove those tags and their content.

<Address>
<Location>Beach</Location>
<Dangerous>
    <Flag>N</Flag>
</Dangerous>
</Address>

I am using the regex pattern below, but it is not replacing anything:

xmlRequest.replaceAll("<Address>.*?</Address>$","");

I am able to do this in Notepad++ by selecting the "[x] . matches newline" checkbox next to the Regular Expression radio button in the Find/Replace dialog box.

Can anyone suggest what is wrong with my regular expression?

Upvotes: 0

Views: 21533

Answers (3)

Raju

Reputation: 2972

A solution with JSoup

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.parser.Parser;

public static void main(String[] args) {
    String xmlContent = "<Address> <Location>Beach</Location><Dangerous> "
            + "<Flag>N</Flag> </Dangerous> </Address>";

    String tagToRemove = "Address";

    // Parse as XML so jsoup does not wrap the content in html/body elements
    Document doc = Jsoup.parse(xmlContent, "", Parser.xmlParser());
    // Remove every matching element together with its children
    doc.getElementsByTag(tagToRemove).remove();
    // Serialize what is left of the document back to a string
    xmlContent = doc.outerHtml();
}

Upvotes: 0

Kerwin

Reputation: 1212

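xmlRequest = xmlRequest.replaceAll("<Address>[\\s\\S]*?</Address>", "");

By default, . does not match line terminators such as \n and \r, so use [\s\S] to match any character, including newlines. The trailing $ is dropped as well, because it anchors the match to the end of the input. Note that replaceAll returns a new string, so the result must be assigned back.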
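An equivalent way to get the Notepad++ ". matches newline" behaviour in Java is the DOTALL flag, written inline as (?s). The following is a minimal sketch, not a definitive implementation: the class name RemoveAddressRegex, the helper removeAddress, and the sample XML are made up for illustration.

```java
import java.util.regex.Pattern;

public class RemoveAddressRegex {
    // (?s) turns on DOTALL, so '.' also matches line terminators.
    // '.*?' is reluctant, so each match stops at the nearest closing tag.
    private static final Pattern ADDRESS_BLOCK =
            Pattern.compile("(?s)<Address>.*?</Address>");

    // Hypothetical helper: returns a copy of xml with every Address block removed.
    static String removeAddress(String xml) {
        return ADDRESS_BLOCK.matcher(xml).replaceAll("");
    }

    public static void main(String[] args) {
        // Sample input invented for this sketch: two multi-line Address blocks.
        String xmlRequest = "<Root>\n<Address>\n<Location>Beach</Location>\n</Address>\n"
                + "<Name>A</Name>\n"
                + "<Address>\n<Location>Hill</Location>\n</Address>\n</Root>";

        // Strings are immutable, so the result has to be assigned back.
        xmlRequest = removeAddress(xmlRequest);
        System.out.println(xmlRequest);
    }
}
```

Precompiling the pattern only matters if the replacement runs many times; for a one-off call, xmlRequest.replaceAll("(?s)<Address>.*?</Address>", "") should behave the same way.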

Upvotes: 8

b4n4n4p4nd4

Reputation: 70

Using a regex for what you are suggesting is arguably improper. (See https://stackoverflow.com/a/1732454/6552039 for hilarity and enlightenment.)

You should be able to parse your XML into an org.w3c.dom.Document, call getElementsByTagName("Address"), and remove the second match from its parent node with removeChild (assuming a particular interpretation of "below tags repeated twice").
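The DOM approach described above could look roughly like the sketch below, which uses only the JDK's built-in XML APIs. It makes two assumptions: that the input is well-formed XML with a single root element, and that every Address element should go rather than only the second one. The class name RemoveAddressDom, the helper removeElements, and the sample XML are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class RemoveAddressDom {
    // Hypothetical helper: parses xml, removes every element named tagName,
    // and serializes the remaining document back to a string.
    static String removeElements(String xml, String tagName) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        NodeList matches = doc.getElementsByTagName(tagName);
        // The NodeList is live, so walk it backwards while removing nodes.
        for (int i = matches.getLength() - 1; i >= 0; i--) {
            Node node = matches.item(i);
            node.getParentNode().removeChild(node);
        }

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Sample input invented for this sketch.
        String xmlRequest = "<Root><Address><Location>Beach</Location>"
                + "<Dangerous><Flag>N</Flag></Dangerous></Address>"
                + "<Name>A</Name></Root>";
        System.out.println(removeElements(xmlRequest, "Address"));
    }
}
```

To remove only the second occurrence, as the answer suggests, the loop could be replaced by a single removeChild call on matches.item(1), after checking that matches.getLength() is at least 2.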

Upvotes: 0
