Robert Herzog

Reputation: 21

Remove XML node in Notepad++

I have a large XML file with the structure below. I want to remove the <tuv xml:lang="en-GB"><seg>CONTENT</seg></tuv> nodes so that only the de-DE part of each unit remains (<tuv xml:lang="de-DE"><seg>CONTENT</seg></tuv>). Is there a way to do this with Notepad++ or another tool? I am not much of a coder, so the simpler the better.

What I have:

<tu tuid="ID_0">
<tuv xml:lang="en-GB">
<seg>Hello!</seg>
</tuv>
<tuv xml:lang="de-DE">
<seg>Hallo!</seg>
</tuv>
</tu>
<tu tuid="ID_1">
<tuv xml:lang="en-GB">
<seg>This is a test content! :)</seg>
</tuv>
<tuv xml:lang="de-DE">
<seg>Das ist ein Testinhalt! :)</seg>
</tuv>
</tu>
<tu tuid="ID_2">
<tuv xml:lang="en-GB">
<seg>All your base are belong tu us ...</seg>
</tuv>
<tuv xml:lang="de-DE">
<seg>Och nö, echt jetzt?</seg>
</tuv>
</tu>

What I want:

<tu tuid="ID_0">
<tuv xml:lang="de-DE">
<seg>Hallo!</seg>
</tuv>
</tu>
<tu tuid="ID_1">
<tuv xml:lang="de-DE">
<seg>Das ist ein Testinhalt! :)</seg>
</tuv>
</tu>
<tu tuid="ID_2">
<tuv xml:lang="de-DE">
<seg>Och nö, echt jetzt?</seg>
</tuv>
</tu>

Upvotes: 2

Views: 2034

Answers (1)

Suresh Anbarasan

Reputation: 1033

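This can be done with Notepad++'s regex find and replace.
Press Ctrl+H to open the Replace dialog.

  • Find what: <tuv xml:lang="en-GB">\r\n.*\r\n.*\r\n
  • Replace with: (leave blank)
  • Search Mode: Regular expression (leave ". matches newline" unchecked, otherwise .* runs past the end of the line)
  • Click Replace All

The pattern assumes Windows (CRLF) line endings and that every en-GB block spans exactly three lines, as in the sample. For files with other line endings, \R can be used in place of \r\n.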
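If the file is not laid out as regularly as the sample (different line endings, indentation, or a <seg> that wraps over several lines), a small script that parses the XML may be more robust than a regex. Below is a minimal sketch in Python using only the standard library. The function name remove_language and the <body> wrapper around the sample are my own assumptions for illustration; the real file presumably already has a single root element.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def remove_language(xml_text: str, lang: str = "en-GB") -> str:
    """Return xml_text with every <tuv> whose xml:lang equals lang removed."""
    root = ET.fromstring(xml_text)
    for tu in root.iter("tu"):
        # Copy the child list first so removing while looping is safe.
        for tuv in list(tu.findall("tuv")):
            if tuv.get(XML_LANG) == lang:
                tu.remove(tuv)
    return ET.tostring(root, encoding="unicode")


# The <body> wrapper is an assumption: a parser needs exactly one root element.
sample = """<body>
<tu tuid="ID_0">
<tuv xml:lang="en-GB">
<seg>Hello!</seg>
</tuv>
<tuv xml:lang="de-DE">
<seg>Hallo!</seg>
</tuv>
</tu>
</body>"""

result = remove_language(sample)
print(result)
```

For a file on disk, the same loop should work on ET.parse(path).getroot(), followed by tree.write(...) to save the result. Note that ElementTree does not preserve a DOCTYPE declaration, so keep a backup of the original file.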

Upvotes: 1
