Reputation: 41
I am trying to use the tr function with two arrays as the from and to sets. The translation doesn't seem to be working, or I am not understanding it correctly. I am new to Perl, so please let me know if I am doing something wrong.
open my $fh,'<',"${main_dir}/char_convert" or die "Cannot open allowed conversion file";
my @from_set;
my @to_set;
my @conversion;
while (my $lines = <$fh>) {
@conversion = split(" ",$lines);
push @from_set,$conversion[0];
push @to_set,$conversion[1];
}
#The variable $line holds the data I want converted:
my $statement;
my $result;
$statement = "tr\@from_set\@to_set\$line;"; # Setup the tr command
$result = eval($statement); # perform the conversion
print "$line\n";
The result is the same as the data going in. No conversion appears to have taken place. What am I doing wrong?
An example part of the data is "PICAÑA". The line in the conversion file is "Ñ N", so I expect to get "PICANA" out, but I get the original data.
Thanks for looking.
Upvotes: 3
Views: 280
Reputation: 386396
I'm assuming you went with tr/// because it's faster than s///. If so, using eval each time you do a translation defeats the purpose. The only way it's going to be faster is if you use eval once but perform multiple transliterations.
In addition to making it possible to use the compiled tr/// multiple times, the following fixes the Perl syntax errors as well as the code injection bugs:
my $from_set = join '', @from_set;
my $to_set = join '', @to_set;
my $tr = eval("sub { \$_[0] =~ tr/\Q$from_set\E/\Q$to_set\E/r }")
or die($@);
my $output = $tr->($input);
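To make the "compile once, call many times" point concrete, here is a minimal self-contained sketch. The sample arrays and input words are hypothetical stand-ins for the conversion file and data, and the /r flag assumes Perl 5.14 or later.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical sample data standing in for the conversion file.
my @from_set = ('Ñ', 'Ê');
my @to_set   = ('N', 'E');

my $from_set = join '', @from_set;
my $to_set   = join '', @to_set;

# eval runs once here and returns a compiled sub; \Q...\E escapes any
# character that would otherwise end or alter the tr/// expression.
my $tr = eval("sub { \$_[0] =~ tr/\Q$from_set\E/\Q$to_set\E/r }")
    or die $@;

# The same compiled transliteration is reused for every input.
my @converted = map { $tr->($_) } ('PICAÑA', 'CRÊPE');
```

The cost of the string eval is paid a single time; each later call only runs the already compiled tr///.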
If, on the other hand, you're only performing the transliteration once, then you're making your life more complicated and slowing down your program for nothing by using tr///. Use s/// instead.
my %map; @map{@from_set} = @to_set;
my $from_set = join '', @from_set;
my $re = qr/([\Q$from_set\E])/;
my $output = $input =~ s/$re/$map{$1}/gr;
Upvotes: 5
Reputation: 1319
Your $statement is a bit off, as the normal form would be $line =~ tr/a/b/, right? So it should be like this:
my $statement = "\$line =~ tr/\Q@from_set\E/\Q@to_set\E/;";
The $line should remain a variable during evaluation, so it is escaped as \$line. The contents of @from_set and @to_set should be interpolated into $statement, so they are given without \.
Upvotes: 3
Reputation: 540
From PerlMonks: if you want safety against injection of slashes, you should use quotemeta like this, or use @ikegami's solution:
eval sprintf "tr/%s/%s/", map quotemeta, $oldlist, $newlist;
https://www.perlmonks.org/?node_id=445971
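A small sketch of why the quotemeta step matters. The sets here are hypothetical and deliberately contain the / delimiter, which would otherwise cut the tr/// expression short.

```perl
use strict;
use warnings;

# Hypothetical sets: the from-set contains the tr delimiter itself.
my $oldlist = 'a/';
my $newlist = 'b-';

my $text = 'a/a/a';

# quotemeta turns "/" into "\/" and "-" into "\-", so the generated
# code stays a single, well-formed tr/// expression.
my $code = sprintf '$text =~ tr/%s/%s/',
    map { quotemeta($_) } $oldlist, $newlist;

# The string eval can see the lexical $text declared above.
eval "$code; 1" or die $@;
```

Without quotemeta the generated string would be $text =~ tr/a//b-/, which Perl does not parse as the intended transliteration.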
Upvotes: 2
Reputation: 69314
There are a few problems here. They are mainly around the syntax of your tr/../../ statement. It should be like this:
$line =~ tr/CHARS/CHARS/;
You have the $line in the wrong place and you're using backslashes instead of forward slashes (tr/.../.../ accepts delimiters other than the forward slash, but remember that backslashes have a special meaning in double-quoted strings).
This seems to do what you want (I've switched to using the internal DATA filehandle for ease of testing).
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
use utf8;
my @from;
my @to;
while (<DATA>) {
chomp;
my @conv = split;
push @from, $conv[0];
push @to, $conv[1];
}
my $line = 'PICAÑA';
my $statement = "\$line =~ tr/@from/@to/";
eval $statement;
say $line;
__DATA__
Ñ N
Ê E
I don't, obviously, know exactly which characters you're dealing with here but it looks like you might find Text::Unidecode useful.
Update: it's also worth pointing out that the tr/.../.../ statement still isn't quite right (although it works). If you print $statement, you'll see it gives:
$line =~ tr/Ñ Ê/N E/
That extra space comes from the fact that Perl puts a space between array elements when they are interpolated in a double-quoted string. It is harmless here because the space simply maps to itself. If you cared, you could fix that by setting $" to an empty string.
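One way to do that, sketched with hypothetical sample arrays, is to localise $" so the change only applies while the statement string is being built:

```perl
use strict;
use warnings;
use utf8;

# Hypothetical sample data.
my @from = ('Ñ', 'Ê');
my @to   = ('N', 'E');

my $statement;
{
    # $" is the separator Perl inserts between array elements during
    # interpolation; local restores the default when the block ends.
    local $" = '';
    $statement = "\$line =~ tr/@from/@to/";
}
```

Outside the block, arrays interpolate with the usual single space again, so other code is unaffected.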
Update 2:
Having thought about it a little more, I think I wouldn't use arrays at all. Why not use scalars instead?
my $from = '';
my $to = '';
# And then, in the loop...
$from .= $conv[0];
$to .= $conv[1];
# And later still...
my $statement = "\$line =~ tr/$from/$to/";
Upvotes: 2