moiamserru

Reputation: 81

Trying to input variable into url and having encoding issues

I am new to Perl and am trying to write a script that takes input from the user, builds a URL from that input, fetches XML data from a website at that URL, and then relays the result back to the user.

However, I am having trouble building a usable link from the user's input.

This is my code in full:

use strict;
use warnings;

my $row = 0;

use XML::LibXML;

print "\n\n\nOn what place do you need a weather report for? -> ";

chomp( my $ort = <> );

my $url = join('', "http://www.yr.no/place/Sweden/Västra_Götaland/",$ort,"/forecast_hour_by_hour.xml");

my $dom = XML::LibXML->load_xml(location => $url);

print "\n\nSee below the weather for ", $ort, ":\n\n";

foreach my $weatherdata ($dom->findnodes('//time')) {

    if($row != 10){ 

        my $temp = $weatherdata->findvalue('./temperature/@value');
        my $value = $weatherdata->findvalue('./@from');

        my $valuesub = substr $value, 11, 5;

        print "At ", $valuesub, " the temperature will be: ", $temp, "C\n";

        $row++;
    }
}

print "\n\n";

If I enter a place I want the weather for, for example:

Mellerud

then the script accepts it and I get a response from the link with proper data. However, if I enter

Åmål

the script cannot make sense of it. I now get:

Could not create file parser context for file "http://www.yr.no/place/Sweden/V├ñstra_G├Âtaland/Åmål/forecast_hour_by_hour.xml": No error at test4.pl line 14

If I replace ",$ort," with a hard-coded Åmål, I get the proper result. I have searched for different encoding approaches, but I have not found a solution that works.

Once again, I would like to point out that I am really new to this. I may be missing something really simple. My apologies for that.

::EDIT 1::

Following a suggestion from @zdim, I added use open ':std', ':encoding(UTF-8)';

This changed the output, but it only produced more errors, as shown here:

(screenshot of the error output)

Also, I am running this in Windows CMD with administrator privileges. According to @zdim, it runs fine on Linux with xterm for input (v5.16). Is there a way to make it work on Windows?

Upvotes: 1

Views: 109

Answers (1)

Silvar

Reputation: 705

The problem is that CMD.exe is limited to 8-bit codepages. In Swedish Windows, the "Å" and "å" characters are mapped to bytes in the upper 8-bit range of codepage 850 (0x8F and 0x86), and those bytes are not valid UTF-8 on their own.
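To make the codepage mismatch concrete, here is a minimal sketch of an alternative to switching consoles: decode the console bytes explicitly and percent-encode the UTF-8 form before putting it in the URL. It assumes the console really is using codepage 850 and that the site accepts percent-encoded UTF-8 in the path; percent_encode_utf8 is a hypothetical helper, not something from the original script.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: percent-encode the UTF-8 bytes of a string so it
# can be used as a single URL path segment.
sub percent_encode_utf8 {
    my ($text) = @_;
    my $bytes = encode('UTF-8', $text);
    # Leave only the RFC 3986 "unreserved" characters untouched.
    $bytes =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $bytes;
}

# The bytes a codepage 850 console delivers when the user types "Åmål"
# (assumption: the active codepage is 850).
my $console_bytes = "\x8Fm\x86l";

# Turn the console bytes into Perl characters before doing anything else.
my $place = decode('cp850', $console_bytes);

print percent_encode_utf8($place), "\n";   # %C3%85m%C3%A5l
```

The same helper could be applied to the "Västra_Götaland" segment if the script is saved as UTF-8 and uses "use utf8;", so that every part of the URL ends up in one encoding.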

If you need to output non-7-bit-ASCII characters, consider running PowerShell ISE. If you set it up correctly, it can cope with any character (in output) that the font you're using supports. The big downside is that PowerShell ISE is not a console, and therefore doesn't allow input from the console/keyboard using STDIN. You can work around this by supplying your input as arguments, from a pipe, in a settings file, or through graphical UI query elements.
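As a rough sketch of the "arguments or pipe" workaround, the place name can be taken from the argument list when one is given and otherwise from a filehandle such as STDIN. The read_place helper is hypothetical, and the demo uses an in-memory filehandle in place of a real pipe so that it runs without any input.

```perl
use strict;
use warnings;

# Hypothetical helper: return the place name from the argument list if
# one was given, otherwise read the first line from the filehandle.
sub read_place {
    my ($args, $fh) = @_;
    my $place = @$args ? $args->[0] : <$fh>;
    die "No place name given\n" unless defined $place;
    $place =~ s/\r?\n\z//;   # strip a trailing newline, Windows or Unix style
    return $place;
}

# Demo: an in-memory filehandle stands in for piped input.
my $piped = "Mellerud\n";
open my $pipe, '<', \$piped or die "Cannot open in-memory handle: $!";

print read_place([], $pipe), "\n";            # taken from the "pipe"
print read_place(['Mellerud'], $pipe), "\n";  # taken from the arguments
```

In a real script the call would look like read_place(\@ARGV, \*STDIN).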

To set up Windows PowerShell ISE to work with UTF8:

  1. Set PowerShell to allow running local unsigned user scripts by running (in an elevated, administrator PowerShell):

    Set-ExecutionPolicy RemoteSigned
    
  2. Create or edit the file "<Documents>\WindowsPowerShell\Microsoft.PowerShellISE_profile.ps1" and add something like:

    perl -w -e 'print qq!Initializing with Perl...\n!;'
    [System.Console]::OutputEncoding = [System.Text.Encoding]::UTF8;
    

    (You need the Perl bit (or something equivalent) there to allow for the modification of the encoding.)

  3. In PowerShell ISE's options, set the font to Consolas.

  4. In your perl scripts, always do:

    binmode(STDOUT, ':encoding(UTF-8)');
    binmode(STDERR, ':encoding(UTF-8)');
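To see what the encoding layer in step 4 does, the following sketch writes a string through the same ':encoding(UTF-8)' layer into an in-memory buffer and shows the bytes that come out. The utf8_bytes_hex helper is made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: write $text through a UTF-8 encoding layer into
# an in-memory buffer and return the resulting bytes as hex.
sub utf8_bytes_hex {
    my ($text) = @_;
    my $buffer = '';
    open my $out, '>:encoding(UTF-8)', \$buffer
        or die "Cannot open buffer: $!";
    print {$out} $text;
    close $out or die "Cannot close buffer: $!";
    return join ' ', map { sprintf '%02X', ord($_) } split //, $buffer;
}

# "Åmål" written as Perl characters (U+00C5 and U+00E5 for Å and å).
print utf8_bytes_hex("\x{C5}m\x{E5}l"), "\n";   # C3 85 6D C3 A5 6C
```

Each of "Å" and "å" becomes two bytes on output, which is what a UTF-8 aware terminal expects to receive.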
    

My solution to the OP's problem:

use strict;
use warnings;

my $row = 0;

use XML::LibXML;

binmode(STDOUT, ':encoding(UTF-8)');
binmode(STDERR, ':encoding(UTF-8)');

@ARGV  or  die "No arguments!\n";

my $ort = shift @ARGV;

print "\n\n\nGetting weather report for \"$ort\"\n";

my $url = join('', "http://www.yr.no/place/Sweden/Västra_Götaland/",$ort,"/forecast_hour_by_hour.xml");

my $dom = XML::LibXML->load_xml(location => $url);

print "\n\nSee below the weather for ", $ort, ":\n\n";

foreach my $weatherdata ($dom->findnodes('//time')) {

    if($row != 10){ 

        my $temp = $weatherdata->findvalue('./temperature/@value');
        my $value = $weatherdata->findvalue('./@from');

        my $valuesub = substr $value, 11, 5;

        print "At ", $valuesub, " the temperature will be: ", $temp, "C\n";

        $row++;
    }
}

print "\n\n";

Output:

(run at around 2018-06-09T14:05 UTC; 16:05 CEST (which is Sweden's time zone)):

PS (censored)> perl -w $env:perl5lib\Tests\Amal-Test.pl "Åmål"



Getting weather report for "Åmål"


See below the weather for Åmål:

At 17:00 the temperature will be: 27C
At 18:00 the temperature will be: 26C
At 19:00 the temperature will be: 25C
At 20:00 the temperature will be: 23C
At 21:00 the temperature will be: 22C
At 22:00 the temperature will be: 21C
At 23:00 the temperature will be: 20C
At 00:00 the temperature will be: 19C
At 01:00 the temperature will be: 18C
At 02:00 the temperature will be: 17C

Another note:

Relying on data to always be in an exact position in a string might not be the best idea.

Instead of:

my $valuesub = substr $value, 11, 5;

maybe consider matching it with a regular expression instead:

if ($value =~ /T((?:[01]\d|2[0-3]):[0-5]\d):/) {
    my $valuesub = $1;
    print "At ", $valuesub, " the temperature will be: ", $temp, "C\n";
}
else {
    warn "Malformed value: $value\n";
}
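The regular expression can be tried on its own with a small stand-alone sketch. The extract_hhmm helper and the sample timestamps are assumptions for illustration; the first sample follows the "date, T, time" layout that the substr offsets in the question imply.

```perl
use strict;
use warnings;

# Hypothetical helper: return the HH:MM part of a timestamp such as
# "2018-06-09T17:00:00", or undef if the value does not match.
sub extract_hhmm {
    my ($value) = @_;
    return $value =~ /T((?:[01]\d|2[0-3]):[0-5]\d):/ ? $1 : undef;
}

for my $value ('2018-06-09T17:00:00', 'not a timestamp') {
    my $hhmm = extract_hhmm($value);
    if (defined $hhmm) {
        print "At $hhmm\n";
    }
    else {
        warn "Malformed value: $value\n";
    }
}
```

Unlike the substr version, this reports a malformed value instead of silently printing whatever happens to sit at offset 11.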

Upvotes: 3
