Reputation: 185
I have an XML file that I need to convert to a list with Perl (without using XSLT).
This is my XML (simplified; I removed about 10 more attributes to make it easier to read):
...
<XMLTAG ID="1" name="NAME1" status="0" date1="24.05.2012 13:37:00" date2="25.05.2012 13:37:00" />
<XMLTAG ID="2" name="NAME2" status="1" date1="24.05.2012 13:37:00" date2="25.05.2012 13:37:00" />
<XMLTAG ID="3" name="NAME3" status="0" date1="24.05.2012 13:37:00" date2="25.05.2012 13:37:00" />
...
What I have so far:
my $input  = 'in.xml';
my $output = 'out.txt';
# open input
open( INPUT, $input )
|| die "Can't open $input: $!";
# open output
open( OUTPUT, ">$output" )
|| die "Can't open $output: $!";
# run until perl returns undef (at the end of the file)
while (<INPUT>) {
# =~ binds the regex to $_ (== would compare numerically); dots are escaped to match literally
if ($_ =~ /date1=\"[0-3]?[0-9]\.[0-3]?[0-9]\.(?:[0-9]{2})?[0-9]{2} [0-5][0-9]:[0-5][0-9]:[0-5][0-9]\"/) {
print OUTPUT $_;}
}
close(INPUT);
close(OUTPUT);
The output file should look like this:
date1="24.05.2012 13:37:00"
date1="24.05.2012 13:37:01"
date1="24.05.2012 13:37:02"
...
Thanks in advance, Marley
Upvotes: 2
Views: 383
Reputation: 3486
Try this for your regex:
date1=\"(.*?)\"
The `?` after `.*` makes the match non-greedy, so it stops at the first closing quote.
UPDATE:
I have been told that there is no need to escape the double quotes, so
date1="(.*?)"
will do.
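To illustrate why the non-greedy form matters here, the following is a minimal sketch that runs both variants against one sample line from the question. The variable names `$greedy` and `$lazy` are only for illustration. Because every line also carries a `date2` attribute, the greedy `.*` runs on to the last double quote on the line, while `.*?` stops at the first one.

```perl
use strict;
use warnings;

# Sample line copied from the question's XML.
my $line = '<XMLTAG ID="1" name="NAME1" status="0" date1="24.05.2012 13:37:00" date2="25.05.2012 13:37:00" />';

# A match in list context returns the captured groups.
my ($greedy) = $line =~ /date1="(.*)"/;     # greedy: extends to the last quote
my ($lazy)   = $line =~ /date1="(.*?)"/;    # non-greedy: stops at the first quote

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
```

The greedy capture includes the trailing `" date2="25.05.2012 13:37:00`, which is usually not what is wanted when a line holds more than one quoted attribute.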
Upvotes: 0
Reputation: 571
You might use a non-greedy match, like this:
if ($_ =~ /(date1=".*?")/ ) {
print OUTPUT "$1\n";
}
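For context, here is a rough sketch of how that match could sit inside a complete read/write loop. The helper name `copy_date1` is hypothetical, and in-memory filehandles stand in for `in.xml` and `out.txt` so the example runs on its own; with real files you would open them by name using the same three-argument `open`.

```perl
use strict;
use warnings;

# Hypothetical helper: copy each date1="..." attribute from $in to $out,
# one per line, and return how many were written.
sub copy_date1 {
    my ($in, $out) = @_;
    my $count = 0;
    while (my $line = <$in>) {
        if ($line =~ /(date1=".*?")/) {
            print {$out} "$1\n";
            $count++;
        }
    }
    return $count;
}

# In-memory stand-ins for the input and output files (an assumption for this sketch).
my $xml = <<'XML';
<XMLTAG ID="1" name="NAME1" status="0" date1="24.05.2012 13:37:00" date2="25.05.2012 13:37:00" />
<XMLTAG ID="2" name="NAME2" status="1" date1="24.05.2012 13:37:00" date2="25.05.2012 13:37:00" />
XML
my $result = '';

open my $in,  '<', \$xml    or die "Can't open in-memory input: $!";
open my $out, '>', \$result or die "Can't open in-memory output: $!";
my $written = copy_date1($in, $out);
close $in;
close $out;

print $result;
```

Lexical filehandles and three-argument `open` avoid the global `INPUT`/`OUTPUT` barewords and keep the file name from being interpreted as a mode.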
Upvotes: 0
Reputation: 126742
You should use a proper XML parsing module. There are many available, but here is a solution using XML::Smart.
It's not a solution I would choose, but I would be interested to know why you have written off XSLT?
use strict;
use warnings;
use XML::Smart;
my $input = 'in.xml';
my $output = 'out.txt';
open my $out, '>', $output or die qq(Can't open output file "$output": $!);
my $xml = XML::Smart->new($input);
my $xmltags = $xml->{root}{XMLTAG};
for my $tag (@$xmltags) {
print $out qq(date1="$tag->{date1}"\n);
}
Output:
date1="24.05.2012 13:37:00"
date1="24.05.2012 13:37:00"
date1="24.05.2012 13:37:00"
Upvotes: 1
Reputation: 39158
use XML::LibXML qw();
my $dom = XML::LibXML->load_xml(location => 'in.xml');
printf qq(date1="%s"\n), $_->getAttribute('date1')
for $dom->findnodes('//XMLTAG');
Upvotes: 6