JBMOS

Reputation: 1

How to modify an XML file using Perl and XML::Twig

I tried to modify the name attribute in an XML file using this program:

use XML::Twig;

open(OUT, ">resutl.xml") or die "cannot open out file main_file:$!";

my $twig = XML::Twig->new(
  pretty_print  => 'indented',
  twig_handlers => {
    association => sub {
      $_->findnodes('div');
      $_->set_att(name => 'xxx');
    },
  },
);

$twig->parsefile('in.xml');

$twig->flush(\*OUT);

 

<div
name="test1" 
booktype="book1"
price="e200" 
/>
<div
name="test2" 
booktype="book2"
price="100" />

When I execute the Perl script, it prints this error:

junk after document element at line 6, column 0, byte 65 at C:/Perl64/lib/XML/Parser.pm line 187.
at C:\Users\admin\Desktop\parse.pl line 14.

Upvotes: 0

Views: 614

Answers (2)

Miller

Reputation: 35208

Well-formed XML requires a single root element. When XML::Twig parses your file, it takes the first div to be the root element of the document. When that element ends and the parser finds another tag at line 6, it rightly reports an error.

If this document is actually intended to be XML, you'll need to enclose the data in a fake element so that it can be parsed. The following does that:

use strict;
use warnings;

use XML::Twig;

my $data = do {local $/; <DATA>};

# Enclose $data in a fake <root> element
$data = qq{<root>$data</root>};

my $twig = XML::Twig->new(
  pretty_print  => 'indented',
  twig_handlers => {
    association => sub {
      $_->findnodes('div');
      $_->set_att(name => 'xxx');
    },
  },
);

$twig->parse($data);

$twig->print;

__DATA__
<div
name="test1" 
booktype="book1"
price="e200" 
/>
<div
name="test2" 
booktype="book2"
price="100" />

Outputs:

<root>
  <div booktype="book1" name="test1" price="e200"/>
  <div booktype="book2" name="test2" price="100"/>
</root>

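It's also unclear what you're trying to do with your "XML". Your handler is registered for association elements, which don't appear in the data, so it never runs and the output above is unchanged. I suspect you're trying to change the name attributes of the div tags to 'xxx'. If that's the case, change your twig_handlers to the following: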

  twig_handlers => {
    '//div' => sub { $_->set_att(name => 'xxx'); },
  },

The output will then be:

<root>
  <div booktype="book1" name="xxx" price="e200"/>
  <div booktype="book2" name="xxx" price="100"/>
</root>
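
Putting the two pieces together, here is a minimal self-contained sketch. It makes a few assumptions that are not in the code above: the data is held in an inline string rather than read from DATA, the handler trigger is the plain element name div (which should match the same elements here), and the final lines that collect the attribute values are only there to check the result.

```perl
use strict;
use warnings;

use XML::Twig;

# The two sibling div elements from the question, as an inline string
# (assumption: the real data would come from in.xml)
my $data = <<'XML';
<div name="test1" booktype="book1" price="e200" />
<div name="test2" booktype="book2" price="100" />
XML

my $twig = XML::Twig->new(
  pretty_print  => 'indented',
  twig_handlers => {
    # Runs once per div element; $_ is the element being handled
    div => sub { $_->set_att(name => 'xxx'); },
  },
);

# Wrap the fragment in a fake <root> so there is a single document element
$twig->parse("<root>$data</root>");

# Collect the name attribute of every div to confirm the change
my @names = map { $_->att('name') } $twig->root->children('div');
print "@names\n";    # prints "xxx xxx"
```

Call $twig->print afterwards if you want the whole modified document rather than just the attribute values.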

Upvotes: 1

Borodin

Reputation: 126762

I have tried to tidy your post a little, but I don't understand the XML fragment that immediately follows the Perl code.

There are two empty div elements without a root element, so as it stands it isn't well-formed XML.

XML::Twig assumes that the first div element is the document (root) element. Since that element is empty, the document ends where it closes, and the second div that follows produces the error message

junk after document element
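
A minimal sketch of that behaviour, assuming a shortened two-element fragment rather than your exact file: parse dies on input that is not well-formed, so the example traps the failure with eval and then parses the same text wrapped in a hypothetical root element for comparison.

```perl
use strict;
use warnings;

use XML::Twig;

# Two sibling elements with no enclosing root (shortened stand-in for in.xml)
my $fragment = qq{<div name="test1"/>\n<div name="test2"/>\n};

# parse() dies when the input is not well-formed, so trap it with eval
my $ok_bare = eval { XML::Twig->new->parse($fragment); 1 } ? 1 : 0;
my $error   = $@;    # save the message before the next eval resets $@

# The same text inside a single root element parses cleanly
my $ok_wrapped = eval { XML::Twig->new->parse("<root>$fragment</root>"); 1 } ? 1 : 0;

print "bare fragment parsed:    $ok_bare\n";
print "wrapped fragment parsed: $ok_wrapped\n";
print "error: $error" unless $ok_bare;
```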

You have also set twig_handlers to a single handler for association elements, but your data has no such elements, so the handler is never called.

I think you need to explain more about what it is that you need to do.

Upvotes: 1
