Reputation: 2589
I have a keyword tag <split/> in the XML file. Based on this tag I need to split the document: every element that is open at that point must be closed, and DUMMY OPENING TAGS must then be added for those same elements, so that the closing tags after the keyword still have matching openers.
For example, input:
<section>
<para> The para sample lines...
<list>
<list-item><para> ..... .... </para></list-item>
<list-item><para> ..... .... </para></list-item>
<list-item><para> ..... <split/> .... </para></list-item>
</list>
The para sample lines.. </para>
</section>
Expected Output:
<section>
<para> The para sample lines...
<list>
<list-item><para> ..... .... </para></list-item>
<list-item><para> ..... .... </para></list-item>
<list-item><para> ..... </para></list-item>
</list>
</para>
</section>
*<split/>*
<section> <!--dummy tag-->
<para><!--dummy tag-->
<list><!--dummy tag-->
<list-item><para><!--dummy tag--> <split/> .... </para></list-item>
</list>
The para sample lines.. </para>
</section>
Note: The asterisks are for identification only (the tag itself needs to be deleted).
I am very new to using modules for markup languages. Could someone help me get the idea? (I am also trying...)
Upvotes: 1
Views: 67
Reputation: 53478
Here's an example of how you could do this using XML::Twig:
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;
# Parse the sample document from the DATA section.
my $first_doc = XML::Twig -> parse ( \*DATA );
# Build a second twig from a copy of the first one's root.
my $second_doc = XML::Twig -> new;
$second_doc -> set_root ( $first_doc -> root -> copy ); #create a copy.
# First doc: delete every sibling that follows the split tag.
while ( my $after_split = $first_doc -> get_xpath('//split',0)->next_sibling ) {
    $after_split -> delete;
}
$first_doc -> get_xpath('//split',0) -> delete; # delete split tag.
# Second doc: delete every sibling that precedes the split tag.
while ( my $before_split = $second_doc -> get_xpath('//split',0)->prev_sibling ) {
    $before_split -> delete;
}
$second_doc -> get_xpath('//split',0) -> delete; # delete split tag.
# Print both halves, indented.
$first_doc -> set_pretty_print ('indented_a');
$first_doc -> print;
print "\n--- second doc ---\n";
$second_doc -> set_pretty_print ('indented_a');
$second_doc -> print;
__DATA__
<section>
<para>
<list>
<list-item><para> sample content for first doc <split/> second doc sample content </para></list-item>
</list>
</para>
</section>
This gives you as output:
<section>
<para>
<list>
<list-item>
<para> sample content for first doc </para>
</list-item>
</list>
</para>
</section>
--- second doc ---
<section>
<para>
<list>
<list-item>
<para> second doc sample content </para>
</list-item>
</list>
</para>
</section>
You will probably want to look at parsefile and sprint from XML::Twig to handle reading your own file and generating output.
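As a rough illustration of those two methods, here is a minimal sketch. The temporary files and their contents are placeholders made up for this example (your own file name would go where $in_path is used), and the splitting step itself is left out.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use XML::Twig;

# Hypothetical input: a throwaway file standing in for your own XML file.
my ( $in_fh, $in_path ) = tempfile( SUFFIX => '.xml', UNLINK => 1 );
print {$in_fh} '<section><para>before<split/>after</para></section>';
close $in_fh or die "Cannot close $in_path: $!";

# parsefile reads and parses a file given its name.
my $twig = XML::Twig->new( pretty_print => 'indented_a' );
$twig->parsefile($in_path);

# sprint returns the serialised document as a string instead of printing it.
my $xml_text = $twig->sprint;

# Hypothetical output file; any filehandle opened for writing would do.
my ( $out_fh, $out_path ) = tempfile( SUFFIX => '.xml', UNLINK => 1 );
print {$out_fh} $xml_text;
close $out_fh or die "Cannot close $out_path: $!";

print "wrote ", length($xml_text), " characters to $out_path\n";
```

Because sprint hands back a string, the same approach works if each half of the split needs to go to a different file.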
Note - this does a 'full split' of the document into essentially two separate documents - but this technique should work within a subtree, because the core of it is locating your split element and deleting everything before or after it as necessary.
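The loops above only remove siblings of the split element itself. In the question's input there is also content before and after its ancestors (the earlier list items, and the text after </list>), so one possible extension is to repeat the same deletion at every ancestor level. The following is a sketch under that assumption; prune_side is a helper name invented for this example, not part of XML::Twig, and the sample XML is a shortened stand-in for the question's input.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;

# Shortened stand-in for the question's input.
my $xml = <<'XML';
<section><para>intro<list><list-item><para>one</para></list-item><list-item><para>before<split/>after</para></list-item></list>tail</para></section>
XML

# prune_side is a hypothetical helper. It deletes everything on one side of
# the first <split/>, climbing from <split/> up to the root so that siblings
# of each ancestor are removed too. $direction is 'next_sibling' (keep the
# part before the split) or 'prev_sibling' (keep the part after it).
sub prune_side {
    my ( $twig, $direction ) = @_;
    my $split = $twig->first_elt('split') or return $twig;
    for ( my $node = $split ; $node ; $node = $node->parent ) {
        while ( my $sibling = $node->$direction ) {
            $sibling->delete;
        }
    }
    $split->delete;    # the keyword tag itself is not wanted in the output
    return $twig;
}

# Parse the input twice so each half can be pruned independently.
my $head = prune_side( XML::Twig->new->parse($xml), 'next_sibling' );
my $tail = prune_side( XML::Twig->new->parse($xml), 'prev_sibling' );

print $head->sprint, "\n";
print "--- second doc ---\n";
print $tail->sprint, "\n";
```

The ancestors of the split element are never deleted, only their siblings, so both halves keep the full chain of enclosing elements. That chain is what the question calls the dummy opening tags.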
Upvotes: 2