How can I unescape / urldecode URL query variables in Perl?

Question

I have a perl script that processes URLs - it was working fine until the server got upgraded. Now it seems to be double encoding the URL string that it returns.

Here is an example of the URL the script used to return:

https://processor.com/?&streetOne=johndoe%40test%2Ecom&key=1234

and here is what it is returning now

https://processor.com/?&email=johndoe%2540test%252Ecom&key=1234

The key difference is that the @ in the URL used to be encoded correctly as %40, but now the percent sign itself is getting encoded (double encoding), so it becomes %2540, which is what happens if you encode the @ and then encode the result again.
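To illustrate the double encoding described above, here is a minimal sketch. The `percent_encode` helper is a hypothetical stand-in written for this example, not the routine the script uses, and it assumes plain ASCII input. It also leaves `.` alone, so it will not reproduce the `%2E` seen in the URLs above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a URL-encoding routine: percent-encodes every
# character except ASCII letters, digits, '-', '_', '.' and '~'.
sub percent_encode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-_.~])/sprintf('%%%02X', ord($1))/ge;
    return $text;
}

my $once  = percent_encode('johndoe@test.com');   # '@' becomes %40
my $twice = percent_encode($once);                # '%' becomes %25, so %40 becomes %2540

print "$once\n";    # johndoe%40test.com
print "$twice\n";   # johndoe%2540test.com
```

The second pass does not know the string is already encoded, so it treats the `%` as ordinary data and encodes it too.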

I'm not sure what would have changed on the server to cause this behavior. I'm a PHP guy and it reminds me of "magic quotes", where PHP would auto-escape all of the query variables it received before the script processed them.

I don't have root access to this server, so if it is not possible to make a change with an .htaccess file or some local configuration option (in the perl script?) then I might need to change what is happening in this function, which gets each of the URL values requested:

First the script reads the Query Variables

(I think this is where the problem could be - I don't understand the regex - something about looking for KEY between white space, and what the $obj->{'key'}; part is doing)

sub getValue
{
  my $obj = shift;
  my $name = shift;

  if ($name =~ /^\s*KEY\s*$/i)
    {
      return $obj->{'key'};
    }

  if (! $obj->isValid($name))
    {
      $obj->addError("Cannot obtain information for field '$name' since field is invalid");
      return 0;
    }

  if (! $obj->isAssigned($name))
    {
      $obj->addError("Cannot obtain value from field '$name' since field has not be assigned a value.");
      return 0;
    }

  return $obj->{'parameters'}->{ $name }->{ 'value'};
}
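For readers unsure about the regex in getValue, the following sketch shows what it accepts. `^` and `$` anchor the whole string, `\s*` allows optional whitespace on either side, and `/i` ignores case, so the test only asks "is this name the word KEY?". The `$obj->{'key'}` expression is an ordinary hash lookup on the object, which is used as a hash reference throughout this code. The `is_key_name` helper is hypothetical and exists only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the same pattern that getValue uses.
sub is_key_name {
    my ($name) = @_;
    return $name =~ /^\s*KEY\s*$/i ? 1 : 0;
}

print is_key_name('key'),     "\n";   # 1: case is ignored
print is_key_name('  Key  '), "\n";   # 1: surrounding whitespace is allowed
print is_key_name('keys'),    "\n";   # 0: extra characters are not
print is_key_name('my key'),  "\n";   # 0: the whole string must be KEY
```

Nothing in this check touches the value's encoding, so it is unlikely to be the source of the double escaping.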

Then the script builds the query variables back into a URL to return:

There is another part of the script that builds the URL that it returns, but I don't think this is the culprit. I tried removing the CGI::escape call from the CGI::escape($value) part and it did not help.

sub create_results
{
  my $obj = shift;

  my ($seconds, $microseconds) = gettimeofday();
  my $timestamp = int($seconds*1000 + $microseconds/1000);

  $obj->assign('timestamp',$timestamp);

  # create query string and hash data

  my $hash_data = '';

  my @query_string = qw();

  foreach my $name (@{ $obj->{'parameter_order'} })
    {
      my $node = $obj->{'parameters'}->{ $name };

      if (defined($node->{'value'}))
        {
          my $value = $node->{'value'};
          $hash_data .= $value;

          # $query->param(-name=>"$name",   -value=>"$value");
          push(@query_string,$name . "=" . CGI::escape($value));
        }
    }

  # Hash
  $hash_data .= $obj->get('key');
  my $hash_digest = md5_hex( $hash_data );

  push(@query_string, "hash=$hash_digest");

  $obj->{'query_string'} = join("&",@query_string);
  $obj->{'hash_digest'} = $hash_digest;
}

The script is a perl package that I'm using. I didn't write it. I posted the full script here: http://pastebin.com/eZr8rQ0t

amon · Accepted Answer

Using the CGI module is a bit dated, and CGI::escape is an undocumented, internal function inside CGI::Util, which is meant for CGI internals only. There is a corresponding unescape function available, but this is hardly The Right Thing to do.

To cut to the chase, using CGI::Util::unescape($dirty_value) somewhere should work, as that module should already be loaded by CGI. Maybe at the return from getValue, but I am too tired to find the correct place. This double encoding seems like a design error, or the example URL you gave is assumed to be escaped already, and the script is just being used wrong.
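As a rough sketch of what a single unescape pass does, assuming the values reach the script already percent-encoded: the `percent_decode` helper below is a hypothetical pure-Perl stand-in so the example runs without CGI installed. In the real script you would call CGI::Util::unescape instead, and its behavior may differ in details.

```perl
use strict;
use warnings;

# Hypothetical stand-in for CGI::Util::unescape: turns each %XX sequence
# back into the character it represents. Performs exactly one pass.
sub percent_decode {
    my ($text) = @_;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# A value that arrives already encoded is decoded back to plain text,
# so escaping it once afterwards yields %40 rather than %2540.
my $clean = percent_decode('johndoe%40test%2Ecom');

# One pass over a double-encoded value only peels off one layer.
my $peeled = percent_decode('johndoe%2540test%252Ecom');

print "$clean\n";    # johndoe@test.com
print "$peeled\n";   # johndoe%40test%2Ecom
```

Decoding should happen once, at the point where the value enters the script, so that the later CGI::escape call in create_results is the only encoding step.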

I'd bet on wrong design; considering that somebody thought it was fun to do

"foo" => { "parameterName" => "foo" },
... # snip like 50 other values
"quux" => { "parameterName" => "quux" },

and other dumbness where map { $_ => {parameterName => $_} } qw/foo ... quux/ would have worked with a fraction of the typing …
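Spelled out, the map version looks like the sketch below. The three names are placeholders standing in for the package's real parameter list.

```perl
use strict;
use warnings;

my @names = qw/foo bar quux/;   # placeholder names, not the real list

# Build one { parameterName => NAME } entry per name.
my %parameters = map { $_ => { parameterName => $_ } } @names;

print $parameters{quux}{parameterName}, "\n";   # quux
print scalar(keys %parameters), "\n";           # 3
```

Adding a parameter then means adding one word to the list instead of another hand-written line.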

Hint: Documentation for all (public) Perl modules (incl. source code) is available at https://metacpan.org/.
