Badboy

Reputation: 21

XML::Twig - Search and Add a line

I am new to Perl and this is my first Perl program. I have an XML file that I need to edit with some kind of automation:

<Host appBase="webapps"
        unpackWARs="true"
        autoDeploy="true"
        deployOnStartup="true"
        deployXML="true"
        name="localhost"
        xmlValidation="false"
        xmlNamespaceAware="false">
</Host>

The goal is to search the XML file for this stanza, take the user input, and add it after xmlNamespaceAware="false" but before the closing tag, so that the <Alias> element is added as a child. The expected output:

<Host appBase="webapps"
            unpackWARs="true"
            autoDeploy="true"
            deployOnStartup="true"
            deployXML="true"
            name="localhost"
            xmlValidation="false"
            xmlNamespaceAware="false">
            <Alias>HOST.com</Alias>
</Host>

Upvotes: 1

Views: 476

Answers (2)

mirod

Reputation: 16171

It is not an easy problem. By wanting to keep the attribute order and (you don't say so, but perhaps) the rest of the formatting, you are not really treating XML as XML. Most XML parsers don't give you the level of detail about the data that you would need to do this.

Software that processes XML should not care about attribute order, or non-significant white space. So adding the attribute with XML::Twig or any other way should be straightforward.
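To illustrate the "straightforward" case, here is a minimal sketch that adds the <Alias> child from the question with XML::Twig and makes no attempt to keep the original formatting or attribute order. The helper name add_alias and the sample input are assumptions made for the example, not part of the original post.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper: add <Alias>$alias</Alias> as the last child
# of every Host element and return the re-serialized document.
sub add_alias {
    my ($xml, $alias) = @_;
    my $twig = XML::Twig->new(
        twig_handlers => {
            # $_ is the Host element that has just been parsed
            Host => sub { $_->insert_new_elt( last_child => 'Alias', $alias ) },
        },
    );
    $twig->parse($xml);
    return $twig->sprint;
}

# Sample input (shortened, illustrative only)
my $xml = '<Host appBase="webapps" name="localhost" xmlNamespaceAware="false"></Host>';
print add_alias( $xml, 'HOST.com' ), "\n";
```

The output is equivalent XML, but the attributes may come out in a different order and the line breaks inside the start tag are lost, which is exactly the constraint discussed next.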

But by wanting to keep the exact same attribute order, you are imposing a constraint on your code that changes it quite radically. You're leaving the domain of XML and treating the data as pure text. Which could be fine and not such a big deal, it may be that you just need to write a simple parser for it, one that gives you access to the original formatting. Except that the input is probably specified as "XML" and it may change in the future in ways that would break your code but not an XML parser.

OK, now that this is out of the way, XML::Twig actually does let you keep the attribute order ;--), using the keep_atts_order option when you create the twig. So that's easy.
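As a small, hedged illustration of that option on its own: the sketch below round-trips a document with keep_atts_order turned on. It assumes Tie::IxHash is installed, since XML::Twig relies on it for this option; the helper name roundtrip and the sample element are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper: parse and re-serialize a document while asking
# XML::Twig to remember the order in which attributes appeared.
sub roundtrip {
    my ($xml) = @_;
    my $twig = XML::Twig->new( keep_atts_order => 1 );  # needs Tie::IxHash
    $twig->parse($xml);
    return $twig->sprint;
}

# Attributes deliberately not in alphabetical order
my $xml = '<Host unpackWARs="true" appBase="webapps" name="localhost"/>';
print roundtrip($xml), "\n";
```

Only the order is kept here; whitespace between attributes is still normalized on output.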

Keeping the formatting is a bit more tricky though. In your case, and for the limited sample of data you gave, I can get it to work by sub-classing the method that returns the start tag for an element. Getting it to work generically would be a lot more complex though.

So here is a framework that you could use:

#!/usr/bin/perl

use strict;
use warnings;
use Test::More;

use XML::Twig;

# get the input and the expected result
my( $in, $expected)= do { $/="\n\n"; <DATA>};
chomp $in; chomp $expected;

my $xna= 'false'; # represents the user input

my $t= XML::Twig->new( twig_handlers => { Host => sub { $_->set_att( xmlNamespaceAware => $xna); } 
                                        },
                       keep_atts_order => 1,            # the bit you were looking for
                       elt_class => 'XML::Twig::MyElt', # to use the element sub-class 
              )
                ->parse( $in);

is( $t->sprint, $expected, 'one test for now');

done_testing();


package XML::Twig::MyElt;

use XML::Twig;
use base 'XML::Twig::Elt';

sub start_tag
  { my( $elt)= @_;
    if( $elt->tag ne 'Host')
      { return $elt->SUPER::start_tag }
    else
      { return '<' . $elt->tag . ' '
              . join( "\n            ", 
                       map { qq{$_="} . $elt->att( $_) . qq{"} } 
                         keys %{$elt->atts}      # the keys are in the right order
                    )
              . '>';
      }
  }

package main;

__DATA__
<Host appBase="webapps"
            unpackWARs="true"
            autoDeploy="true"
            deployOnStartup="true"
            deployXML="true"
            name="localhost"
            xmlValidation="false">
            <Alias>HOST.com</Alias>
</Host>

<Host appBase="webapps"
            unpackWARs="true"
            autoDeploy="true"
            deployOnStartup="true"
            deployXML="true"
            name="localhost"
            xmlValidation="false"
            xmlNamespaceAware="false">
            <Alias>HOST.com</Alias>
</Host>

But really, keeping the format intact is madness. Or fun if you like this kind of challenge ;--)

Upvotes: 3

Andrew Edvalson

Reputation: 7878

I haven't used XML::Twig - I've used XML::Simple though. If it's imperative that the attributes stay in order, you might have to just stick with string processing.

use XML::Simple;

my $xml = '<Host appBase="webapps" unpackWARs="true"  autoDeploy="true" deployOnStartup="true" deployXML="true" name="localhost" xmlValidation="false" xmlNamespaceAware="false"></Host>';
my $ref = XMLin($xml);
$ref->{Alias} = { content => 'User Input' };
my $newxml = XMLout($ref, RootName => 'Host');
print $newxml;
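If the attribute order and layout really must stay byte-for-byte the same, the string-processing route could look roughly like the sketch below. It is only a sketch under strong assumptions: the closing </Host> tag sits on a line of its own, only the first Host element is touched, and the helper name add_alias_text and the 12-space default indent are made up for the example. It will break on input that a real XML parser would handle without trouble.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: insert an <Alias> line just before the first
# closing </Host> tag, leaving the rest of the text untouched.
sub add_alias_text {
    my ($xml, $alias, $indent) = @_;
    $indent = ' ' x 12 unless defined $indent;

    # Escape XML special characters in the user input
    for ($alias) { s/&/&amp;/g; s/</&lt;/g; s/>/&gt;/g; }

    # Assumes </Host> starts its own line (optionally indented)
    $xml =~ s{^([ \t]*)</Host>}{$indent<Alias>$alias</Alias>\n$1</Host>}m;
    return $xml;
}

my $xml = qq{<Host appBase="webapps"\n        xmlNamespaceAware="false">\n</Host>\n};
print add_alias_text( $xml, 'HOST.com' );
```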

Upvotes: 0
