Graham Brigden

Reputation: 41

Use tr with arrays

I am trying to use the tr function with two arrays as the from and to sets. The translation doesn't seem to be working, or I am not understanding it correctly. I am new to Perl, so please let me know if I am doing something wrong.

I load the arrays as (I know this part works):

open my $fh,'<',"${main_dir}/char_convert" or die "Cannot open allowed conversion file";
my @from_set;
my @to_set;
my @conversion;
while (my $lines = <$fh>) {
  @conversion = split(" ",$lines);
  push @from_set,$conversion[0];
  push @to_set,$conversion[1];
}

#The variable $line holds the data I want converted:
my $statement;
my $result;
$statement = "tr\@from_set\@to_set\$line;"; # Setup the tr command
$result = eval($statement); # perform the conversion
print "$line\n";

The result is the same as the data going in. No conversion appears to have taken place. What am I doing wrong?

An example part of the data is "PICAÑA". The line in the conversion file is "Ñ N", so I expect to get "PICANA" out, but I get the original data.

Thanks for looking

Upvotes: 3

Views: 280

Answers (4)

ikegami

Reputation: 386396

I'm assuming you went with tr/// because it's faster than s///. If so, using eval each time you do a translation defeats the purpose. The only way it's going to be faster is if you use eval once, but perform multiple transliterations.

In addition to making it possible to use the compiled tr/// multiple times, the following fixes the Perl syntax errors as well as the code injection bugs:

my $from_set = join '', @from_set;
my $to_set   = join '', @to_set;

my $tr = eval("sub { \$_[0] =~ tr/\Q$from_set\E/\Q$to_set\E/r }")
   or die($@);

my $output = $tr->($input);
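To make the "eval once, transliterate many times" idea concrete, here is a minimal self-contained sketch. The sample arrays and input strings are made up for illustration (they stand in for whatever char_convert contains), and the /r flag assumes Perl 5.14 or later.

```perl
use strict;
use warnings;
use utf8;   # this source contains literal non-ASCII characters

# Hypothetical sample data standing in for the contents of char_convert.
# The "/" entry shows why the \Q...\E quoting matters.
my @from_set = ('Ñ', 'Ê', '/');
my @to_set   = ('N', 'E', '-');

my $from_set = join '', @from_set;
my $to_set   = join '', @to_set;

# Compile the transliteration a single time.
my $tr = eval("sub { \$_[0] =~ tr/\Q$from_set\E/\Q$to_set\E/r }")
   or die($@);

# Reuse the compiled sub for every input string.
my @output = map { $tr->($_) } ('PICAÑA', 'CRÊPE', 'A/B');
```

After this runs, @output should hold the three converted strings, and the cost of the string eval is paid only once no matter how many lines are processed.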

If, on the other hand, you're only performing the transliteration once, then you're making your life more complicated and slowing down your program for nothing by using tr///. Use s/// instead.

my %map; @map{@from_set} = @to_set;
my $from_set = join '', @from_set;
my $re = qr/([\Q$from_set\E])/;

my $output = $input =~ s/$re/$map{$1}/gr;
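A small runnable sketch of the same s/// approach, with invented sample data. One side effect of going through a hash is that a replacement may be longer than one character, which tr/// cannot do; the 'ß' entry below is an assumption added only to show that.

```perl
use strict;
use warnings;
use utf8;   # this source contains literal non-ASCII characters

# Hypothetical conversion table; each key must be a single character
# because the keys are placed in a character class.
my @from_set = ('Ñ', 'ß');
my @to_set   = ('N', 'ss');

my %map; @map{@from_set} = @to_set;
my $from_set = join '', @from_set;
my $re = qr/([\Q$from_set\E])/;

my $input  = 'PICAÑA STRAßE';
# /r returns the modified copy and leaves $input untouched (Perl 5.14+).
my $output = $input =~ s/$re/$map{$1}/gr;
```

No eval is involved here, so there is no generated code to get wrong or to inject into.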

Upvotes: 5

vlumi

Reputation: 1319

Your $statement is a bit off, as the normal form would be $line =~ tr/a/b/, right? So it should be like this:

my $statement = "\$line =~ tr/\Q@from_set\E/\Q@to_set\E/;";

The $line should remain a variable during evaluation, so it is escaped as \$line. The contents of the @from_set and @to_set should be interpolated into $statement, so they are given without \.
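Inspecting the generated string before passing it to eval makes the effect of the escaping visible. The following is only an illustrative sketch with made-up ASCII data:

```perl
use strict;
use warnings;

# Hypothetical sample data.
my @from_set = ('a', 'b');
my @to_set   = ('x', 'y');
my $line     = 'abc';

# \$line stays a variable name; the arrays are interpolated right away.
my $statement = "\$line =~ tr/\Q@from_set\E/\Q@to_set\E/;";

# $statement now contains:  $line =~ tr/a\ b/x\ y/;
# The "\ " is the space Perl puts between interpolated array elements,
# quoted by \Q. It maps space to space, so it is harmless here.

eval $statement;
die $@ if $@;   # surface any syntax error in the generated code
```

Checking $@ after the eval is worth keeping, since a malformed statement otherwise fails silently and leaves $line unchanged.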

Upvotes: 3

sergiotarxz

Reputation: 540

From PerlMonks: if you want safety against injection of slashes, you should use quotemeta like this, or use @ikegami's solution:

eval sprintf "tr/%s/%s/", map quotemeta, $oldlist, $newlist;

https://www.perlmonks.org/?node_id=445971
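As a rough sketch of what quotemeta protects against, the made-up sets below contain a slash and a hyphen, both of which would otherwise change the meaning of the generated tr///. Note that this form operates on $_.

```perl
use strict;
use warnings;

# Hypothetical sets: "/" would end the tr/// early and "-" could be
# read as a range if they were interpolated without quoting.
my ($oldlist, $newlist) = ('/-', '_+');

$_ = 'a/b-c';

# Generates and runs:  tr/\/\-/_\+/
eval sprintf "tr/%s/%s/", map quotemeta, $oldlist, $newlist;
die $@ if $@;

my $result = $_;
```

Here $result should end up as 'a_b+c', with the slash and hyphen treated as ordinary characters.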

Upvotes: 2

Dave Cross

Reputation: 69314

There are a few problems here. They are mainly around the syntax of your tr/../../ statement. It should be like this:

$line =~ tr/CHARS/CHARS/;

You have the $line in the wrong place and you're using backslashes instead of forward slashes (you can use backslashes as the delimiter in a tr/.../.../ statement, but remember that they have a special meaning in double-quoted strings).

This seems to do what you want (I've switched to using the internal DATA filehandle for ease of testing).

#!/usr/bin/perl

use strict;
use warnings;
use feature 'say';
use utf8;

my @from;
my @to;

while (<DATA>) {
  chomp;
  my @conv = split;
  push @from, $conv[0];
  push @to,   $conv[1];
}

my $line = 'PICAÑA';

my $statement = "\$line =~ tr/@from/@to/";

eval $statement;

say $line;

__DATA__
Ñ N
Ê E

I don't, obviously, know exactly which characters you're dealing with here but it looks like you might find Text::Unidecode useful.

Update: it's also worth pointing out that the tr/.../.../ statement still isn't quite right (although it works). If you print $statement, you'll see it gives:

$line =~ tr/Ñ Ê/N E/

That extra space comes from the fact that Perl puts a space between array elements when they are interpolated in a double-quoted string. If you cared, you could fix that by setting $" to an empty string.
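A minimal sketch of that fix, using local so the change to $" is confined to one block. The variable names $with_space and $without_space are just for illustration:

```perl
use strict;
use warnings;
use utf8;   # this source contains literal non-ASCII characters

my @from = ('Ñ', 'Ê');
my @to   = ('N', 'E');

# Default list separator: a single space between elements.
my $with_space = "\$line =~ tr/@from/@to/";

my $without_space;
{
    # $" is the separator used when arrays interpolate into strings.
    local $" = '';
    $without_space = "\$line =~ tr/@from/@to/";
}
```

The first string contains the stray spaces shown above, while the second contains the two sets with nothing between the characters.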

Update 2:

Having thought about it a little more, I think I wouldn't use arrays at all. Why not use scalars instead?

my $from = '';
my $to   = '';

# And then, in the loop...

$from .= $conv[0];
$to   .= $conv[1];

# And later still...

my $statement = "\$line =~ tr/$from/$to/";

Upvotes: 2
