I need some help tweaking my Perl script.
I've got an input file with comma-separated values, like so:
to_em,from_em,flags,updated,marks
[email protected]#hv,[email protected],16,2007-08-18 16:18:50,33
The first row holds the column names (to_em, from_em, flags, updated, marks),
and the following record holds the value for each column:
to_em = [email protected]#hv
from_em = [email protected]
flags = 16
updated = 2007-08-18 16:18:50
marks = 33
I am also creating a unique value (MD5), prefixed with "__pkey__".
Each column name starts with ^E. Each value starts with ^A, including the hex value. The record ends with ^D.
I want the final output file to look like this:
__pkey__^Ad41d8cd98f00b204e9800998ecf8427e^Eto_em^[email protected]#hv^Efrom_em^[email protected]^Eflags^A16^Eupdated^A2007-08-18 16:18:50^Emarks^A33^E^D
But it's coming out like this:
__pkey__^Ad41d8cd98f00b204e9800998ecf8427e^E^Ato_em^E^D__pkey__^A5c09354d0d3d34c96dbad8fa14ff175e^E^[email protected]#hv^E^D
Here's my code:
use strict;
use Digest::MD5 qw(md5_hex);
my $data = '';
while (<>) {
    my $digest = md5_hex($data);
    chomp;
    my ($val) = split /,/;
    $data = $data. "__pkey__^A$digest^E^A$val^E^D";
}
print $data;
exit;
This seems to work:
use strict;
use Digest::MD5 qw(md5_hex);
my $data = '';
my $line1 = <>;
chomp $line1;
my @heading = split /,/, $line1;
#my ($sep1, $sep2, $eor) = (chr(1), chr(5), chr(4));
my ($sep1, $sep2, $eor) = ( "^A", "^E", "^D");
while (<>)
{
    my $digest = md5_hex($data);
    chomp;
    my (@values) = split /,/;
    my $extra = "__pkey__$sep1$digest$sep2" ;
    $extra .= "$heading[$_]$sep1$values[$_]$sep2" for (0..$#values);
    #$extra .= "$heading[$_]$sep1$values[$_]$sep2" for (0..scalar(@values)-1);
    #for my $i (0..$#values)
    #{
    #    $extra .= "$heading[$i]$sep1$values[$i]$sep2";
    #}
    $data .= "$extra$eor";
}
print $data;
It reads the first line, chomps it, and splits it into fields stored in the array @heading.
It then reads each subsequent line, chomps it, splits it into fields, computes the digest of the data accumulated so far, and generates the output record.
At the end, it prints all the accumulated data.
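Because md5_hex($data) is called before the current record is appended, the first key is the MD5 of an empty string, which is why the sample output starts with d41d8cd98f00b204e9800998ecf8427e. A minimal sketch of that behaviour follows; the second half, which hashes the raw input row instead, is only an assumption about what a "unique value" could mean and is not what the script above does:

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# As in the script above: the digest of everything accumulated so far.
# Before the first record, that is the empty string.
my $data = '';
my $first_key = md5_hex($data);
print "$first_key\n";    # d41d8cd98f00b204e9800998ecf8427e

# Assumed alternative (not in the original script): derive the key from
# the raw input row, so it depends only on that row's content.
my $row     = '16,2007-08-18 16:18:50,33';   # illustrative row
my $row_key = md5_hex($row);
print "$row_key\n";
```

With the accumulated-data approach, each key depends on every record before it, so reordering the input changes all later keys.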
If you want actual control characters instead of caret-letter pairs, use the line with chr() instead of the one that follows it.
If you don't like the all-on-one-line loop, use the commented-out one.
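For the control-character option, here is a small sketch showing that chr(1), chr(5) and chr(4) are the same bytes as Perl's "\cA", "\cE" and "\cD" escapes. The caret_notation helper is a hypothetical name, added here only to make the invisible characters printable:

```perl
use strict;
use warnings;

# Three equivalent ways to write Ctrl-A, Ctrl-E and Ctrl-D in Perl.
my @from_chr    = (chr(1), chr(5), chr(4));
my @from_escape = ("\cA", "\cE", "\cD");
my @from_hex    = ("\x01", "\x05", "\x04");

# Hypothetical helper: show control characters in caret notation.
sub caret_notation {
    my ($text) = @_;
    (my $shown = $text) =~ s/([\x00-\x1F])/'^' . chr(ord($1) + 64)/ge;
    return $shown;
}

# One field followed by the end-of-record marker, using real control bytes.
my $record = join '', 'flags', $from_chr[0], '16', $from_chr[1], $from_chr[2];
print caret_notation($record), "\n";    # flags^A16^E^D
```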
Something like this should get you the kind of output you showed:
use strict;
use Digest::MD5 qw(md5_hex);
my $data = '';
my $first_line = <>;
chomp($first_line);
my @columns = split(/,/, $first_line);
while (<>) {
    chomp;
    my (@vals) = split /,/;
    my $record = "";
    foreach my $column_num (0..$#columns) {
        $record .= "^E$columns[$column_num]^A$vals[$column_num]";
    }
    my $digest = md5_hex($data);
    # The requested output has a trailing ^E before the ^D terminator.
    $data = $data. "__pkey__^A$digest$record^E^D";
}
print $data;
exit;