Timothy Harding

Reputation: 377

Remove XML Node, including children if present, via text editor

I'd like to remove a tag/node entirely in Notepad++ (or another FOSS text editor). The node may or may not have children, and possibly grandchildren, and so on. I've tried regular expressions (suggested in a few other SO questions), but I'm having trouble with the multiline aspect of these nodes/tags.

<exampleTag id="blah" name="bob">
    <childTag possible="element" />
    <moreChildren>
        <evenAnotherLevel />
    </moreChildren>
</exampleTag>

Both TextWrangler and Notepad++ can collapse a node for easier reading, which makes it easy to delete the whole thing manually. That won't work for a file with potentially tens of thousands of these tags, though. Is there a tool/plugin out there that can do this? Right now I break out node.js to get this done, but that isn't a solution for laymen.

Upvotes: 2

Views: 2541

Answers (2)

Jim Grace

Reputation: 111

In text editors such as BBEdit, TextWrangler, and others that use PCRE (Perl Compatible Regular Expressions), you can turn on the "Magic Dot" option (which allows . to match \r and \n) by putting (?s) at the front of your search pattern.

Also, when looking for the closing XML tag, such as </exampleTag>, be sure to use a non-greedy search by adding ? after any pattern like .* that could otherwise match past the end tag.

So for example in TextWrangler you can search for

(?s)^    <exampleTag code="[0-9]*" name="[0-9]* - .*?</exampleTag>$.

and replace it with nothing. This finds every <exampleTag that starts 4 spaces into a line, has a numeric code and a name beginning with a sequence of digits followed by " - ", and then matches anything (non-greedily) up to the closing </exampleTag> at the end of a line. The final . ensures that the newline is deleted too. (On Windows you might need two dots for the CR-LF pair.)
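
For anyone who wants to try the same idea outside an editor, here is a minimal Python sketch of the "dot matches newline plus non-greedy" search. It is an illustration under assumptions, not the exact pattern above: the sample input, the strip_example_tags name, and the looser attribute matching are all made up for the demo. Python's re needs (?m) for ^ to match at line starts, so both flags are set. Like any regex approach, it would likely misbehave if exampleTag elements are nested inside each other or written as self-closing tags.

```python
import re

# Hypothetical sample input; only the exampleTag block comes from the question.
SAMPLE = """<root>
    <keep />
    <exampleTag id="blah" name="bob">
        <childTag possible="element" />
        <moreChildren>
            <evenAnotherLevel />
        </moreChildren>
    </exampleTag>
    <keep />
</root>
"""

# (?s): . also matches newlines ("Magic Dot").
# (?m): ^ matches at the start of every line, not just the start of the text.
# .*? : non-greedy, so the match stops at the first </exampleTag>.
# The trailing [ \t]*\r?\n? swallows the rest of the line, including LF or CR-LF.
PATTERN = re.compile(r"(?sm)^[ \t]*<exampleTag\b.*?</exampleTag>[ \t]*\r?\n?")


def strip_example_tags(text: str) -> str:
    """Return text with every matched exampleTag block removed."""
    return PATTERN.sub("", text)


if __name__ == "__main__":
    print(strip_example_tags(SAMPLE))
```

Handling the line ending with \r?\n? avoids having to count dots for LF versus CR-LF files.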

Upvotes: 1

Wiktor Stribiżew

Reputation: 626932

You can run an XSLT transformation directly from Notepad++ to remove specific nodes together with all their inner XML. Here is a template you can use or adjust to your needs:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="exampleTag"/> <!-- The tags we want to remove are here -->
</xsl:stylesheet>

Save this as an *.xslt file on disk. Then open your XML in Notepad++, go to Plugins -> XML Tools -> XSL Transformation, provide the path to your XSLT file, and click Transform.
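
If you would rather script it, the effect of the stylesheet (copy everything, drop every exampleTag with its subtree) can be approximated with a real XML parser. The sketch below is not XSLT; it swaps in Python's standard xml.etree.ElementTree, and the remove_elements name and the sample document are assumptions made for illustration. Unlike a regex, it copes with nested or self-closing elements, but it cannot remove the root element itself, and it drops the whitespace that followed each removed node.

```python
import xml.etree.ElementTree as ET


def remove_elements(xml_text: str, tag: str) -> str:
    """Return xml_text with every <tag> element (and its children) removed."""
    root = ET.fromstring(xml_text)
    # Snapshot the elements first so the tree is never mutated while iterating.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == tag:
                parent.remove(child)  # removes the whole subtree
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical input; exampleTag appears at two different depths.
    doc = (
        '<root><keep/>'
        '<exampleTag id="blah" name="bob"><childTag possible="element"/>'
        '<moreChildren><evenAnotherLevel/></moreChildren></exampleTag>'
        '<wrap><exampleTag/></wrap></root>'
    )
    print(remove_elements(doc, "exampleTag"))
```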

Upvotes: 0
