Marley

Reputation: 185

Perl write String from XML to File using difficult Regex

I have an XML file that I need to convert to a list with Perl (without using XSLT).

This is my XML (simplified; I removed about 10 more attributes to make it easier to read):

...
<XMLTAG ID="1" name="NAME1" status="0" date1="24.05.2012 13:37:00" date2="25.05.2012 13:37:00" />
<XMLTAG ID="2" name="NAME2" status="1" date1="24.05.2012 13:37:00" date2="25.05.2012 13:37:00" />
<XMLTAG ID="3" name="NAME3" status="0" date1="24.05.2012 13:37:00" date2="25.05.2012 13:37:00" />
...

What I have so far:

my $input = in.xml;
my $output = out.txt;

# open input
open( INPUT, $input )
  || die "Can't find $input: $_";

# open output
open( OUTPUT, ">$output" )
  || die "Can't find $output: $_";

    # run until perl returns undef (at the end of the file)
    while (<INPUT>) {
        if ($_ == /date1=\"[0-3]?[0-9].[0-3]?[0-9].(?:[0-9]{2})?[0-9]{2} [0-5][0-9]:[0-5][0-9]:[0-5][0-9]\"/) {
        print OUTPUT $_;};
    }
    close(INPUT);
    close(OUTPUT);

The output file should look like this:

date1="24.05.2012 13:37:00"
date1="24.05.2012 13:37:01"
date1="24.05.2012 13:37:02"
...

Thanks in advance, Marley

Upvotes: 2

Views: 383

Answers (5)

Try this for your regex:

date1=\"(.*?)\"

It makes the match non-greedy, so the capture stops at the first closing double quote.

UPDATE:

I have been told that there is no need to escape the double quotes, so

date1="(.*?)"

will do.
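
To see why the non-greedy quantifier matters here, a small sketch may help. The sample line is copied from the question; the variable names are only illustrative. It compares `.*` with `.*?` on a line that contains more than one quoted attribute after `date1`:

```perl
use strict;
use warnings;

# Sample line taken from the question.
my $line = '<XMLTAG ID="1" name="NAME1" status="0" date1="24.05.2012 13:37:00" date2="25.05.2012 13:37:00" />';

# Greedy: .* runs on to the LAST double quote on the line,
# so the capture also swallows the date2 attribute.
my ($greedy) = $line =~ /date1="(.*)"/;

# Non-greedy: .*? stops at the FIRST double quote after date1=".
my ($lazy) = $line =~ /date1="(.*?)"/;

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
```

With the greedy version the capture runs through to the end of the `date2` value, which is why the `?` is needed when several attributes share a line.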

Upvotes: 0

Pygmalion

Reputation: 571

You might use a non-greedy match, like this:

if ($_ =~ /(date1=".*?")/) {
    print OUTPUT "$1\n";
}
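
Combined with the file handling from the question, the whole loop might look like the sketch below. The three-argument `open`, the lexical filehandles, and `$!` in the error messages are assumptions about how one would tidy the original snippet, and the helper name `copy_date1` is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: copy every date1="..." attribute found in $input
# to $output, one per line. Returns the number of attributes written.
sub copy_date1 {
    my ($input, $output) = @_;

    # $! holds the system error message if open fails.
    open my $in,  '<', $input  or die "Can't open $input: $!";
    open my $out, '>', $output or die "Can't open $output: $!";

    my $count = 0;
    while (my $line = <$in>) {
        # =~ binds the regex to $line; the parentheses capture
        # only the attribute, not the whole line.
        if ($line =~ /(date1=".*?")/) {
            print {$out} "$1\n";
            $count++;
        }
    }

    close $in;
    close $out or die "Can't close $output: $!";
    return $count;
}

# Example call with the file names from the question:
# copy_date1('in.xml', 'out.txt');
```

This only picks up the first `date1` attribute on each line, which matches the one-tag-per-line layout shown in the question.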

Upvotes: 0

choroba

Reputation: 241968

Using XML::XSH2:

open in.xml ;
ls //@date1 ;

Upvotes: 1

Borodin

Reputation: 126742

You should use a proper XML parsing module. There are many available, but here is a solution using XML::Smart.

It's not a solution I would choose, but I would be interested to know why you have written off XSLT?

use strict;
use warnings;

use XML::Smart;

my $input = 'in.xml';
my $output = 'out.txt';

open my $out, '>', $output or die qq(Can't open output file "$output": $!);

my $xml = XML::Smart->new($input);
my $xmltags = $xml->{root}{XMLTAG};

for my $tag (@$xmltags) {
  print $out qq(date1="$tag->{date1}"\n);
}

Output:

date1="24.05.2012 13:37:00"
date1="24.05.2012 13:37:00"
date1="24.05.2012 13:37:00"

Upvotes: 1

daxim

Reputation: 39158

use XML::LibXML qw();
my $dom = XML::LibXML->load_xml(location => 'in.xml');
printf qq(date1="%s"\n), $_->getAttribute('date1')
    for $dom->findnodes('//XMLTAG');

Upvotes: 6
