Reputation: 143
= XML FILE =
<?xml version="1.0" encoding="utf-8"?>
<weatherdata>
<location>
<name>Toronto</name>
<type/>
<country>CA</country>
<timezone/>
<location altitude="0" latitude="43.700111" longitude="-79.416298" geobase="geonames" geobaseid="0"/></location>
<credit/>
<meta>
<lastupdate/>
<calctime>1.4906</calctime>
<nextupdate/>
</meta>
<sun rise="2015-02-17T12:12:32" set="2015-02-17T22:50:42"/>
<forecast>
<time from="2015-02-17T15:00:00" to="2015-02-17T18:00:00">
<symbol number="803" name="broken clouds" var="04d"/>
<precipitation/>
<windDirection deg="43.5048" code="NE" name="NorthEast"/>
<windSpeed mps="1.82" name="Light breeze"/>
<temperature unit="celsius" value="-13.29" min="-13.293" max="-13.29"/>
<pressure unit="hPa" value="1007.77"/>
<humidity value="100" unit="%"/>
<clouds value="broken clouds" all="64" unit="%"/>
</time>
<time from="2015-02-17T18:00:00" to="2015-02-17T21:00:00">
<symbol number="803" name="broken clouds" var="04d"/>
<precipitation/>
<windDirection deg="255.501" code="WSW" name="West-southwest"/>
<windSpeed mps="0.66" name="Calm"/>
<temperature unit="celsius" value="-10.16" min="-10.16" max="-10.16"/>
<pressure unit="hPa" value="1006.44"/>
<humidity value="100" unit="%"/>
<clouds value="broken clouds" all="80" unit="%"/>
</time>
= DUMPER EXTRACT =
'att' => {
'to' => '2015-02-22T00:00:00',
'from' => '2015-02-21T21:00:00'
'att' => {
'value' => '100',
'unit' => '%'
'next_sibling' => $VAR1->{'twig_root'}{'first_child'}{'next_sibling' } {'next_sibling'}{'next_sibling'}{'next_sibling'}{'last_child'}{'prev_sibling'} {'last_child'}{'prev_sibling'},
'att' => {
'unit' => 'hPa',
'value' => '1020.87'
'prev_sibling' => bless( {
'att' => {
'min' => '-8.313',
'max' => '-8.313',
'unit' => 'celsius',
I am looking to extract the following from the XML file:
'from' (only the time), 'humidity value' (the value), 'temperature max' (the temp value), 'temperature min' (the temp value), 'pressure value' (the hPa value)
The code below is my draft, written to see whether I was on the right track. The intention was to get it working with a few nodes and write them out to a CSV file. I am not getting anywhere...
= PERL CODE =
use strict;
use Data::Dumper;
use XML::Simple 'XMLin';
my $input_xml = '/var/egridmanage_pl/data/longrange.xml' ||die $!;
my $output_csv = '/var/egridmanage_pl/longrange.csv';
my $parse = XMLin('/var/egridmanage_pl/data/longrange.xml',forcearray => ['value']);
foreach my $dataset (@{$parse->{weatherdata}}) {
if ($dataset->{name} eq 'Toronto') {
open my $out, ">", $output_csv or die "Could not open $output_csv: $!";
print {$out} $dataset->{att}-> {from} . "\n";
print {$out} $dataset->{att}->[0]->{value} . "\n";
}
}
= INTENDED RESULTS WOULD BE THE FOLLOWING = (I NEED HELP!!)
time | humidity | hPa | min | max |
15:00:00 | 100 | 1007.77 | -13.29 | -13.29 |
Upvotes: 1
Views: 398
Reputation: 53478
Whilst you have an answer, you've tagged it as Perl, so I'll contribute something Perlish. First off though: don't use XML::Simple. From its docs:
The use of this module in new code is discouraged. Other modules are available which provide more straightforward and consistent interfaces.
Personally, I like XML::Twig when it comes to XML parsing. It goes something like this. (Note: I've cut down your XML, because yours is incomplete and thus invalid. That shouldn't matter, because this code only inspects the <time> elements.)
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Called once for each complete <time> element.
sub time_handler {
    my ( $twig, $time ) = @_;
    my $temperature = $time->first_child('temperature');
    # Keep "\n" outside the join so the row does not end in a stray tab.
    print join( "\t",
        $time->att('from'),
        $time->first_child('humidity')->att('value'),
        $time->first_child('pressure')->att('value'),
        $temperature->att('min'),
        $temperature->att('max') ),
        "\n";

    # Discard what has been parsed so far, keeping the memory footprint small.
    $twig->purge;
}

# Undefine the input record separator so <DATA> is read in one go.
local $/;
XML::Twig->new( twig_handlers => { 'time' => \&time_handler } )
    ->parse(<DATA>);
__DATA__
<?xml version="1.0" encoding="utf-8"?>
<weatherdata>
<time from="2015-02-17T15:00:00" to="2015-02-17T18:00:00">
<symbol number="803" name="broken clouds" var="04d"/>
<precipitation/>
<windDirection deg="43.5048" code="NE" name="NorthEast"/>
<windSpeed mps="1.82" name="Light breeze"/>
<temperature unit="celsius" value="-13.29" min="-13.293" max="-13.29"/>
<pressure unit="hPa" value="1007.77"/>
<humidity value="100" unit="%"/>
<clouds value="broken clouds" all="64" unit="%"/>
</time>
<time from="2015-02-17T18:00:00" to="2015-02-17T21:00:00">
<symbol number="803" name="broken clouds" var="04d"/>
<precipitation/>
<windDirection deg="255.501" code="WSW" name="West-southwest"/>
<windSpeed mps="0.66" name="Calm"/>
<temperature unit="celsius" value="-10.16" min="-10.16" max="-10.16"/>
<pressure unit="hPa" value="1006.44"/>
<humidity value="100" unit="%"/>
<clouds value="broken clouds" all="80" unit="%"/>
</time>
</weatherdata>
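The handler above prints the full 'from' timestamp, tab-separated, whereas the question asks for only the time and a pipe-delimited layout. A minimal sketch of one way to get there follows, assuming the timestamp always has the form date, 'T', time. The helper names time_only and forecast_rows are made up for this illustration, and the inline XML is a cut-down stand-in for the real file.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper: keep only the clock time of a timestamp,
# so "2015-02-17T15:00:00" becomes "15:00:00".
sub time_only {
    my ($timestamp) = @_;
    my ( undef, $time ) = split /T/, $timestamp, 2;
    return $time;
}

# Hypothetical helper: parse XML text, return one array ref per <time> element.
sub forecast_rows {
    my ($xml) = @_;
    my @rows;
    my $twig = XML::Twig->new(
        twig_handlers => {
            'time' => sub {
                my ( $t, $time ) = @_;
                my $temperature = $time->first_child('temperature');
                push @rows, [
                    time_only( $time->att('from') ),
                    $time->first_child('humidity')->att('value'),
                    $time->first_child('pressure')->att('value'),
                    $temperature->att('min'),
                    $temperature->att('max'),
                ];
                $t->purge;    # free what has been handled so far
            },
        },
    );
    $twig->parse($xml);
    return @rows;
}

# Cut-down sample input (assumption: same shape as the question's file).
my $xml = <<'XML';
<weatherdata>
  <forecast>
    <time from="2015-02-17T15:00:00" to="2015-02-17T18:00:00">
      <temperature unit="celsius" value="-13.29" min="-13.293" max="-13.29"/>
      <pressure unit="hPa" value="1007.77"/>
      <humidity value="100" unit="%"/>
    </time>
  </forecast>
</weatherdata>
XML

print join( '|', qw(time humidity hPa min max) ), "\n";
print join( '|', @$_ ), "\n" for forecast_rows($xml);
```

To read the real file instead of a string, $twig->parsefile($path) could replace $twig->parse($xml), and the output can be redirected into the CSV file from the shell.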
Upvotes: 2
Reputation: 22617
Let me suggest something radically different. Since your input is an XML document, you could use XSLT to extract data from it.
Your Perl code would then only execute the transformation; everything else would be handled in an XSLT stylesheet. You'd need a library that includes an XSLT processor, and in my opinion XML::LibXML together with XML::LibXSLT would be the safest choice (sample code taken from here):
use XML::LibXSLT;
use XML::LibXML;
my $xslt = XML::LibXSLT->new();
my $source = XML::LibXML->load_xml(location => 'foo.xml');
my $style_doc = XML::LibXML->load_xml(location=>'bar.xsl', no_cdata=>1);
my $stylesheet = $xslt->parse_stylesheet($style_doc);
my $results = $stylesheet->transform($source);
print $stylesheet->output_as_bytes($results);
Assuming a well-formed input XML, use the following transformation.
XSLT Stylesheet
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="text" encoding="UTF-8" />
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:text>time|humidity|hPa|min|max|
</xsl:text>
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="time">
<xsl:value-of select="concat(@from,'|',humidity/@value, '|', pressure/@value,'|', temperature/@min, '|', temperature/@max, '|')"/>
<xsl:if test="following::time">
<xsl:text>
</xsl:text>
</xsl:if>
</xsl:template>
<xsl:template match="text()"/>
</xsl:transform>
Text Output
time|humidity|hPa|min|max|
2015-02-17T15:00:00|100|1007.77|-13.293|-13.29|
2015-02-17T18:00:00|100|1006.44|-10.16|-10.16|
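The output above still carries the full timestamp, while the question asks for only the time. A minimal sketch of one possible adjustment, assuming the 'from' attribute always contains a 'T' separator: the XPath 1.0 function substring-after() keeps the part after it. The stylesheet and the input are embedded as strings here purely so the example is self-contained; the variable names and the cut-down XML are assumptions for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXSLT;

# Variant of the stylesheet above; substring-after() drops the date part.
my $xsl = <<'XSL';
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text" encoding="UTF-8"/>
  <xsl:template match="/">
    <xsl:text>time|humidity|hPa|min|max|&#10;</xsl:text>
    <xsl:apply-templates select="//time"/>
  </xsl:template>
  <xsl:template match="time">
    <xsl:value-of select="concat(substring-after(@from, 'T'), '|', humidity/@value, '|', pressure/@value, '|', temperature/@min, '|', temperature/@max, '|')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:transform>
XSL

# Cut-down sample input (assumption: same shape as the question's file).
my $xml = <<'XML';
<weatherdata>
  <forecast>
    <time from="2015-02-17T15:00:00" to="2015-02-17T18:00:00">
      <temperature unit="celsius" value="-13.29" min="-13.293" max="-13.29"/>
      <pressure unit="hPa" value="1007.77"/>
      <humidity value="100" unit="%"/>
    </time>
  </forecast>
</weatherdata>
XML

my $source     = XML::LibXML->load_xml( string => $xml );
my $style_doc  = XML::LibXML->load_xml( string => $xsl, no_cdata => 1 );
my $stylesheet = XML::LibXSLT->new->parse_stylesheet($style_doc);
my $results    = $stylesheet->transform($source);

# Serialise according to the stylesheet's xsl:output settings.
my $output = $stylesheet->output_as_bytes($results);
print $output;
```

With real files, load_xml( location => ... ) as in the sample code earlier in this answer would replace the string arguments.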
Upvotes: 4