Reputation: 63
So I'm trying to figure out how to substitute the numbers in specific columns and replace that number with a letter. I'm coding in Perl
input.txt
10004226549870
204226549870062001186000000000040008060802
5032000067470318021604226549870062001186
603200100001312293000522105000000456000131289
603200200006545553000522109000004242000654555
603200300002463923000522090000005571000246392
603200400002635413000521196000248920000263541
60320050000175960300052196600000101700017596
603200600001054853004867190000001003000105485
603200700001451223000522095000003981000145122
75030000674703180216000700017222840007
89999990674703180216000070001722284
9000013
Substitution Table
1 -> A
2 -> B
3 -> C
4 -> D
5 -> E
6 -> F
7 -> G
8 -> H
9 -> I
0 -> {
Substitution should occur under the following conditions:
Line starts with a "6", for the numbers located in column 17 and column 45.
Line starts with a "7", for the number located in column 34.
Line starts with a "8", for the number located in column 35.
With the above substitution rules, the resulting file (using the current file as an example), would result in:
output.txt
10004226549870
2042265498700620
5032000067470318021604226549870062001186
6032001000013122I3000522105000000456000131289
6032002000065455E3000522109000004242000654555
6032003000024639B3000522090000005571000246392
6032004000026354A3000521196000248920000263541
6032005000017596{300052196600000101700017596
6032006000010548E3004867190000001003000105485
6032007000014512B3000522095000003981000145122
750300006747031802160007000172228D0007
8999999067470318021600007000172228D
9000013
**MyCode
my $fn = 'input.txt';
my $wr = 'output.txt';
my %repl = (
1 => "A"
2 => "B"
3 => "C"
4 => "D"
5 => "E"
6 => "F"
7 => "G"
8 => "H"
9 => "I"
0 => "{"
);
open ( my $fh, '<', $fn ) or die "Could not open file '$fn': $!";
open ( my $ww, '>', $wr ) or die "Could not open file '$fn': $!";
my @lines;
while (my $line = <$fh>) {
chomp $line;
push @lines, $line;
if ( $line =~ /^6/) {
foreach my $key (sort keys %repl){
substr($line, 17, 1) =~ s/\b$key\b/$repl{$key}/g
substr($line, 45, 1) =~ s/\b$key\b/$repl{$key}/g
}
}
elsif ($line =~ /^7/) {
foreach my $key (sort keys %repl){
substr($line, 34, 1) =~ s/\b$key\b/$repl{$key}/g
}
}
elsif ($line =~ /^8/) {
foreach my $key (sort keys %repl){
substr($line, 35, 1) =~ s/\b$key\b/$repl{$key}/g
}
}
else {
print $ww $_;
}
}
close $fh;
close $ww;
Upvotes: 3
Views: 216
Reputation: 2317
I generally like to avoid using the regex engine unless I'm matching a pattern. Here's how I might do it.
use strict;
use warnings;
my %table = (
1 => 'A', 2 => 'B', 3 => 'C', 4 => 'D', 5 => 'E',
6 => 'F', 7 => 'G', 8 => 'H', 9 => 'I', 0 => '{',
);
my %subs = (
6 => [17, 45],
7 => [34],
8 => [35],
);
while (my $line = <DATA>) {
chomp $line;
my $cols = $subs{ substr($line, 0, 1) } // [];
for my $col (@$cols) {
next if length($line) < $col;
my $char = substr($line, $col - 1, 1);
substr($line, $col - 1, 1) = $table{$char};
}
print "$line\n";
}
__DATA__
10004226549870
204226549870062001186000000000040008060802
5032000067470318021604226549870062001186
603200100001312293000522105000000456000131289
603200200006545553000522109000004242000654555
603200300002463923000522090000005571000246392
603200400002635413000521196000248920000263541
60320050000175960300052196600000101700017596
603200600001054853004867190000001003000105485
603200700001451223000522095000003981000145122
75030000674703180216000700017222840007
89999990674703180216000070001722284
9000013
Upvotes: 2
Reputation: 66899
This is good and the only problem I can see is that you may have miscounted, as substr counts start with 0. The other thing is that every line needs to be written to the output file, so lose that final else
and just write out the line, changed or not.†
However, I'd offer an improvement. Probing for a character by running a regex for each key is not needed, and is wasteful and expensive. There are other ways, which are simpler and efficient.
Using substr
we need two calls when its replacement is decided dynamically
substr $line, 16, 1, $repl{ substr $line, 16, 1 };
This replaces a substring of length 1
at position 17 (counting starts with 0) with a value in %repl
for a key which is the substring of length 1
at position 17.
But this line on its own can have a problem if the string was shorter than the required position. Ths is in fact the case with the line that gets {
as replacement, which has only 44 characters. Given rather "specific" behavior of substr
with edge cases it's better to not reach for "fixes" but check
# for $col == 45 (etc)
if (length $line >= $col) {
substr $line, $col-1, 1, $repl{ substr $line, $col-1, 1 };
}
Those positions can be prepared in an array to avoid scattered magic numbers or scalars.
Another way is with a regex
$line =~ s/.{16}\K(.)/$repl{$1}/;
This matches 16 characters, and \K
drops those matches so they don't need to be put back, and the next character is captured and then replaced by its value in the hash. Now a possibly non-existing position (past end of string) doesn't require special handling since the match simply fails.
Either of these two above can be used instead of the foreach
loops over the %repl
keys and your code should work as it stands, except for the else
branch which need be removed and $line
just printed. I also hope that you do have
use warnings;
use strict;
somewhere at the beginning, as these are strongly recommended in every program.
Matching a line on either 6
/7
/8
is OK but you can instead build a hash with those numbers as keys, and their values be code references performing the needed replacements. If these numbers change often (or if there can be many more) then consider doing something like that.
† Finally, the print $ww $_;
in the else
branch would draw warnings since once a topicalizer is used in the while
condition (the $line
) then $_
isn't provided anymore (it's undefined).
However, that's probably a matter of posting since it uses $ww
while the defined handle is $wr
, and there are a few other typos.
Upvotes: 3
Reputation: 6798
Define an array @repl
with characters, create hash %match
, use keys of %match
hash as input regex filter.
Read data and if line matches to regex filter pass it for further processing. Regex captures matching digit which we use to extract positions from %match
hash later.
Split input line into an array of digits and replace digit in position of interest with substitute from @repl
array.
Assemble the line back with join
and print result.
use strict;
use warnings;
use feature 'say';
my @repl = split '', '{ABCDEFGHI';
my %match = ( 6 => [17,45], 7 => [34], 8 => [35] );
my $filter = join('|',keys %match);
my $re = qr/^($filter)/;
while(my $line = <DATA>) {
chomp $line;
if( $line =~ /$re/ ) {
my @nums = split '', $line;
for my $pos ( @{$match{$1}} ) {
my $i = $pos-1;
$nums[$i] = $repl[$nums[$i]] if defined $nums[$i];
}
$line = join '', @nums;
}
say $line;
}
__DATA__
10004226549870
204226549870062001186000000000040008060802
5032000067470318021604226549870062001186
603200100001312293000522105000000456000131289
603200200006545553000522109000004242000654555
603200300002463923000522090000005571000246392
603200400002635413000521196000248920000263541
60320050000175960300052196600000101700017596
603200600001054853004867190000001003000105485
603200700001451223000522095000003981000145122
75030000674703180216000700017222840007
89999990674703180216000070001722284
9000013
Output
10004226549870
204226549870062001186000000000040008060802
5032000067470318021604226549870062001186
6032001000013122I300052210500000045600013128I
6032002000065455E300052210900000424200065455E
6032003000024639B300052209000000557100024639B
6032004000026354A300052119600024892000026354A
6032005000017596{300052196600000101700017596
6032006000010548E300486719000000100300010548E
6032007000014512B300052209500000398100014512B
750300006747031802160007000172228D0007
8999999067470318021600007000172228D
9000013
Upvotes: 1