Phyllis

Reputation: 21

Delete tags between specific tags in XML (Notepad++)

I already used the search function, but I didn't find an answer to my question.

I have an XML structure like the following (example):

<Task name="1B">
 <Person type="XX" name="YY" height="ZZ"/>
 <Person type="XX" name="YY" height="ZZ"/>
 <Person type="XX" name="YY" height="ZZ"/>
</Task>

<Task name="1C">
 <Person type="XX" name="YY" height="ZZ"/>
 <Person type="XX" name="YY" height="ZZ"/>
 <Person type="XX" name="YY" height="ZZ"/>
</Task>

Now I want to use Notepad++ to delete the Task tag with the name "1B", together with all the tags between its opening and closing tags. Is there a way to do this in Notepad++? I already tried a regex pattern, but I couldn't get it to work.

Upvotes: 1

Views: 2152

Answers (1)

Wiktor Stribiżew

Reputation: 626946

Using regex to process XML or HTML is strongly discouraged, since it leads to many issues and unnecessary questions. See RegEx match open tags except XHTML self-contained tags. XSLT, a language designed for transforming XML, is the right tool here.

Create a UTF-8 encoded file named, for example, remove_xml_tag.xsl and paste this into it:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="Task[@name='1B']"/>
</xsl:stylesheet>

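The first template copies every node and attribute (node()|@*) to the output unchanged. The second template, which matches "Task[@name='1B']", is empty: when the processor encounters a Task element whose name attribute equals 1B, it writes nothing, so the element and everything inside it are dropped from the output.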
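
For readers who want to check the expected result outside Notepad++, the same "copy everything except the matching Task" behaviour can be sketched in Python with the standard library's xml.etree.ElementTree. This is not XSLT, only an illustration of the effect. The <Tasks> root element and the remove_tasks function name are assumptions made for the sketch; the question does not show the root of the real file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the question's fragment wrapped in an assumed <Tasks>
# root, because a well-formed XML document needs exactly one root element.
SAMPLE = """<Tasks>
 <Task name="1B">
  <Person type="XX" name="YY" height="ZZ"/>
 </Task>
 <Task name="1C">
  <Person type="XX" name="YY" height="ZZ"/>
 </Task>
</Tasks>"""


def remove_tasks(xml_text, task_name):
    """Return the XML text with every Task named task_name removed."""
    root = ET.fromstring(xml_text)
    # Snapshot all elements first so that removing children does not
    # disturb the traversal.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "Task" and child.get("name") == task_name:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


result = remove_tasks(SAMPLE, "1B")
print(result)
```

Unlike the stylesheet, this sketch cannot remove a matching Task that is itself the root element, because the root has no parent to remove it from.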

Then run the XML Tools plugin -> XSL Transformation. A settings dialog opens.

Click the ... button on the right and browse to the XSL file.

Click the Transform button.

As a fallback for malformed XML, you can use the following regex with an empty replacement. It works only if there are no nested Task nodes:

<Task\s+name="1B">[^<]*(?:<(?!/Task>)[^<]*)*</Task>
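
To see what the pattern matches before running it on a real file, here is a minimal sketch that applies it with Python's re module. The pattern is copied unchanged from above. The sample text is a shortened version of the question's example, and the names PATTERN, SAMPLE, and cleaned are only illustrative.

```python
import re

# The fallback pattern from the answer, unchanged.
PATTERN = re.compile(r'<Task\s+name="1B">[^<]*(?:<(?!/Task>)[^<]*)*</Task>')

# Shortened version of the question's example (no root element needed,
# since the regex works on plain text).
SAMPLE = '''<Task name="1B">
 <Person type="XX" name="YY" height="ZZ"/>
 <Person type="XX" name="YY" height="ZZ"/>
</Task>

<Task name="1C">
 <Person type="XX" name="YY" height="ZZ"/>
</Task>'''

# Replace the whole 1B block, from its opening tag through the first
# closing </Task>, with nothing.
cleaned = PATTERN.sub("", SAMPLE)
print(cleaned)
```

The pattern stops at the first </Task> it reaches, which is why a Task nested inside the 1B block would cut the match short. In Notepad++ the same pattern should go into the Replace dialog with Search Mode set to "Regular expression" and the "Replace with" field left empty.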

Upvotes: 1
