Remove all occurrences of a specific attribute from an XML document

Question

I have an XML file with content like:


  
    
     
    gutkha
    split_identifier ,
    and
    what
    role
    split_identifier ,
    if
    any
    split_identifier ,
    nicotine
    contributes
    to
    the
    effects
    split_identifier .
    Adult
    male
    mice
    were
    treated
    daily
    for

I want to remove all occurrences of the "ExposureSentence" attribute. The output would be:

  gutkha
    split_identifier ,
    and
    what
    role
    split_identifier ,
    if
    any
    split_identifier ,
    nicotine
    contributes
    to
    the
    effects
    split_identifier .
    Adult
    male
    mice
    were
    treated
    daily
    for

I tried the following, but I'm not sure how to proceed further.

        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
        Document doc = dBuilder.parse(new ByteArrayInputStream(xml.getBytes()));
        NodeList sectionNodeList = doc.getElementsByTagName("section");
        for (int i = 0; i < sectionNodeList.getLength(); i++)
        {
            Node sectionNode = sectionNodeList.item(i);

        }

Sean Bright · Accepted Answer

XPath makes this straightforward:

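import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.stream.Collectors;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// xml is assumed to be a String holding the document, as in the question
public static void main(String... args)
        throws Exception
{
    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
    // Use an explicit charset instead of the platform default
    Document doc = dBuilder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));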

    XPathFactory xPathfactory = XPathFactory.newInstance();
    XPath xpath = xPathfactory.newXPath();

    // Find word elements with ExposureSentence attribute
    XPathExpression query = xpath.compile("//word[@ExposureSentence]");
    NodeList words = (NodeList) query.evaluate(doc, XPathConstants.NODESET);
    for (int i = 0; i < words.getLength(); i++) {
        // Remove the attribute
        ((Element) words.item(i)).removeAttribute("ExposureSentence");
    }

    // Drop "ExposureSentence" from the comma-separated list in each ComponentName
    query = xpath.compile("//ComponentName");
    NodeList componentNames = (NodeList) query.evaluate(doc, XPathConstants.NODESET);
    for (int i = 0; i < componentNames.getLength(); i++) {
        String content = componentNames.item(i).getTextContent();
        componentNames.item(i).setTextContent(
            Arrays.stream(content.split(","))
                .map(String::trim)
                .filter(s -> !s.equals("ExposureSentence"))
                .collect(Collectors.joining(", ")));
    }

    // Omitted: Save the XML
}
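
The save step omitted above could be handled with the JDK's built-in Transformer. What follows is a minimal sketch rather than part of the original answer: the class name AttributeStripper, the helper stripAttribute, and the one-line sample document are all made up for illustration, and the XPath is widened to "//*[@...]" so that it matches any element carrying the attribute, not only word elements. It covers the attribute removal only, not the ComponentName cleanup.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class AttributeStripper {

    // Hypothetical helper: removes attrName from every element that carries it
    // and returns the resulting document serialized as a String.
    static String stripAttribute(String xml, String attrName) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        // "//*[@name]" selects any element that has the attribute
        NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//*[@" + attrName + "]", doc, XPathConstants.NODESET);
        for (int i = 0; i < hits.getLength(); i++) {
            ((Element) hits.item(i)).removeAttribute(attrName);
        }

        // Serialize the modified DOM; a StreamResult over a File would write to disk instead
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample input; the real document structure may differ
        String xml = "<section><word ExposureSentence=\"1\" id=\"w1\">gutkha</word></section>";
        System.out.println(stripAttribute(xml, "ExposureSentence"));
    }
}
```

To write to a file rather than a String, passing new StreamResult(new File("out.xml")) to transform should work the same way (the file name here is only a placeholder).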
