Split the parent element if a keyword is found in an XML file using Perl

Question

Question Updated

I have a keyword tag in the XML file. Based on this tag, I need to split the document at that point: every element that is open at the keyword must be closed, and DUMMY OPENING TAGS for the same elements must then be added after the keyword to match the closing tags that follow it.

For example, input:


    The para sample lines...
      
      ..... .... 
      ..... .... 
      .....  .... 
      
     The para sample lines..

Expected Output:


    The para sample lines...
      
      ..... .... 
      ..... .... 
      ..... 
      
   

**
 
   
      
       .... 
      
      The para sample lines..

Note: The asterisks are for identification purposes only (the keyword tag itself needs to be deleted).

I am very new to using modules for markup languages. Could someone help me get the idea? (I am also still trying myself.)

Sobrique · Accepted Answer

Here's an example of how you could do this using XML::Twig:

#!/usr/bin/env perl
use strict;
use warnings;

use XML::Twig;

my $first_doc = XML::Twig -> parse ( \*DATA ); 

my $second_doc = XML::Twig -> new; 
$second_doc -> set_root ( $first_doc -> root -> copy ); #create a copy. 

while ( my $after_split = $first_doc -> get_xpath('//split',0)->next_sibling ) {
   $after_split -> delete;
}

$first_doc -> get_xpath('//split',0) -> delete; # delete split tag.

while ( my $before_split = $second_doc -> get_xpath('//split',0)->prev_sibling ) {
   $before_split -> delete;
}

$second_doc -> get_xpath('//split',0) -> delete; # delete split tag. 

$first_doc -> set_pretty_print ('indented_a');
$first_doc -> print;

print "\n--- second doc ---\n";
$second_doc -> set_pretty_print ('indented_a');
$second_doc -> print;


__DATA__

   
      
       sample content for first doc  second doc sample content

This gives you as output:


  
    
      
         sample content for first doc 
      
    
  


--- second doc ---

  
    
      
         second doc sample content

You will probably want to look at parsefile and sprint from XML::Twig to handle reading your own file, and generating output.
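
As a rough sketch of what that could look like, assuming a small made-up document (the doc and para element names and the temporary files are placeholders, not from the question): parsefile reads the XML from a path, and sprint returns the serialised document as a string that you can write wherever you like.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use File::Temp qw(tempfile);
use XML::Twig;

# Write a small sample file so the sketch is self-contained.
# In practice $in_path would simply be the path to your own XML file.
my ( $in_fh, $in_path ) = tempfile( SUFFIX => '.xml', UNLINK => 1 );
print {$in_fh} '<doc><para>before <split/> after</para></doc>';
close $in_fh or die "Cannot close $in_path: $!";

my $twig = XML::Twig->new;
$twig->parsefile($in_path);    # dies if the file is not well-formed

# Stand-in for the real processing: remove every split marker.
$_->delete for $twig->get_xpath('//split');

# sprint returns the document as a string instead of printing it.
my $xml = $twig->sprint;

# Write the string to an output file (again a temporary placeholder).
my ( $out_fh, $out_path ) = tempfile( SUFFIX => '.xml', UNLINK => 1 );
print {$out_fh} $xml;
close $out_fh or die "Cannot close $out_path: $!";

print "$xml\n";
```

Keeping the result in a string first makes it easy to post-process or to write several output files from one parse.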

Note - this does a 'full split' of the document into essentially two separate documents, but the technique should also work within a subtree, because the core of it is locating your split element and deleting everything before or after it as necessary.
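
A minimal sketch of that subtree variant, under some assumptions: the marker is called split (as in the XPath above), the element to be split is a para that is not the document root, and the doc, para, list and item names as well as the split_at_marker and trim_siblings helpers are invented for illustration. It copies the enclosing element, pastes the copy after the original, then trims what follows the marker in the original and what precedes it in the copy, at every level between the marker and the element being split.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use Scalar::Util qw(refaddr);
use XML::Twig;

# Hypothetical helper: split the nearest $target_tag ancestor of the
# first $marker_tag element into two sibling elements.
# Assumes that ancestor is not the root (a root cannot have siblings).
sub split_at_marker {
    my ( $twig, $marker_tag, $target_tag ) = @_;

    my $marker = $twig->first_elt($marker_tag) or return;
    my $first  = $marker->parent($target_tag)  or return;

    # Deep copy of the element being split, placed right after it.
    my $second = $first->copy;
    $second->paste( after => $first );
    my $marker_copy = $second->first_descendant($marker_tag);

    # Original keeps what precedes the marker, copy keeps what follows.
    trim_siblings( $marker,      $first,  'next_sibling' );
    trim_siblings( $marker_copy, $second, 'prev_sibling' );

    $_->delete for $marker, $marker_copy;
    return 1;
}

# Walk from $node up to (not including) $stop, deleting all siblings
# on one side ('next_sibling' or 'prev_sibling') at each level.
sub trim_siblings {
    my ( $node, $stop, $direction ) = @_;
    while ( refaddr($node) != refaddr($stop) ) {
        while ( my $sibling = $node->$direction ) {
            $sibling->delete;
        }
        $node = $node->parent;
    }
    return;
}

my $twig = XML::Twig->new;
$twig->parse( '<doc><para>The para sample lines <list><item>one</item>'
      . '<item>two <split/>three</item><item>four</item></list>'
      . ' trailing text</para></doc>' );

split_at_marker( $twig, 'split', 'para' );

my $result = $twig->sprint;
print "$result\n";
```

With this input the single para should come out as two para elements, the first ending after "two" and the second starting at "three", each with its own list and item wrappers.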
