jdamae

Reputation: 3909

Perl: replacing commas and embedding values with control characters

I need some help tweaking my Perl script.

I've got an input file with comma separated values like so:

to_em,from_em,flags,updated,marks
[email protected]#hv,[email protected],16,2007-08-18 16:18:50,33

The first row holds the column names (to_em, from_em, flags, updated, marks), and each following record holds the values for those columns:

to_em = [email protected]#hv
from_em = [email protected]
flags = 16
updated = 2007-08-18 16:18:50
marks = 33

I am also creating a unique value (MD5), prefixed with "__pkey__".

Each column name starts with ^E. Each value starts with ^A, including the hex value. The record will end with ^D.

I want the final output file to look like this:

__pkey__^Ad41d8cd98f00b204e9800998ecf8427e^Eto_em^[email protected]#hv^Efrom_em^[email protected]^Eflags^A16^Eupdated^A2007-08-18 16:18:50^Emarks^A33^E^D

But it's coming out like this:

__pkey__^Ad41d8cd98f00b204e9800998ecf8427e^E^Ato_em^E^D__pkey__^A5c09354d0d3d34c96dbad8fa14ff175e^E^[email protected]#hv^E^D

Here's my code:

use strict;
use Digest::MD5 qw(md5_hex);
my $data = '';
while (<>) {
   my $digest = md5_hex($data);
   chomp;
   my ($val) = split /,/;
   $data = $data. "__pkey__^A$digest^E^A$val^E^D";
}
print $data;
exit;

Upvotes: 0

Views: 156

Answers (2)

Jonathan Leffler

Reputation: 754500

This seems to work:

use strict;
use Digest::MD5 qw(md5_hex);
my $data = '';
my $line1 = <>;
chomp $line1;
my @heading = split /,/, $line1;
#my ($sep1, $sep2, $eor) = (chr(1), chr(5), chr(4));
my ($sep1, $sep2, $eor) = ( "^A", "^E", "^D");
while (<>)
{
    my $digest = md5_hex($data);
    chomp;
    my (@values) = split /,/;
    my $extra = "__pkey__$sep1$digest$sep2" ;
    $extra .= "$heading[$_]$sep1$values[$_]$sep2" for (0..$#values);
    #$extra .= "$heading[$_]$sep1$values[$_]$sep2" for (0..scalar(@values)-1);
    #for my $i (0..$#values)
    #{
    #   $extra .= "$heading[$i]$sep1$values[$i]$sep2";
    #}
    $data .= "$extra$eor";
}
print $data;

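It reads the first line, chomps it, and splits it on commas into the array @heading.

For each subsequent line, it computes the MD5 digest of the data accumulated so far (not of the line itself; for the first record that is the empty string, whose digest is d41d8cd98f00b204e9800998ecf8427e), chomps the line, splits it into fields, and then generates the output record by pairing each heading with its value.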

At the end, it prints all the accumulated data.
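The steps described above can also be sketched as a function that works on a list of lines rather than on standard input, which makes the logic easier to check. This is only a sketch: build_records is a hypothetical helper name, and the two-column sample input is made up for illustration.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# build_records is a hypothetical helper, not part of the original script.
# It takes the three separators followed by the input lines (header first).
sub build_records {
    my ($sep1, $sep2, $eor, @lines) = @_;
    chomp @lines;
    my @heading = split /,/, shift @lines;   # first line holds column names
    my $data = '';
    for my $line (@lines) {
        my $digest = md5_hex($data);         # digest of the output built so far
        my @values = split /,/, $line;
        my $record = "__pkey__$sep1$digest$sep2";
        $record .= "$heading[$_]$sep1$values[$_]$sep2" for 0 .. $#values;
        $data .= "$record$eor";
    }
    return $data;
}

# Made-up sample input: a header line and one record.
my $out = build_records('^A', '^E', '^D', "flags,marks\n", "16,33\n");
print "$out\n";
# prints __pkey__^Ad41d8cd98f00b204e9800998ecf8427e^Eflags^A16^Emarks^A33^E^D
```

Because the digest is taken over everything emitted so far, the key of each record depends on all the records before it, which is the same behaviour as the code in the question.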

If you want actual control characters instead of caret-letter, use the line with chr() instead of the following one.
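As a rough illustration of the difference, the sketch below builds a record with real control characters and then renders it in caret notation for inspection. The visible() helper and the "abc" stand-in for the digest are assumptions made for this example only.

```perl
use strict;
use warnings;

# ASCII control characters: Ctrl-A = 0x01, Ctrl-E = 0x05, Ctrl-D = 0x04.
my ($sep1, $sep2, $eor) = (chr(1), chr(5), chr(4));

# visible() is a hypothetical debugging helper: it rewrites control
# characters in caret notation so a record can be read on a terminal.
sub visible {
    my ($text) = @_;
    $text =~ s/([\x00-\x1f])/'^' . chr(ord($1) + 64)/ge;
    return $text;
}

# "abc" stands in for a real MD5 digest to keep the example short.
my $record = "__pkey__${sep1}abc${sep2}flags${sep1}16${sep2}${eor}";
print visible($record), "\n";    # prints __pkey__^Aabc^Eflags^A16^E^D
```

With chr(), each separator is a single byte in the output file, whereas the literal strings "^A", "^E" and "^D" are two printable characters each.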

If you don't like the all-on-one-line loop, use the commented-out for loop instead.

Upvotes: 1

DVK

Reputation: 129481

Something like this should get you the kind of output you showed:

use strict;
use Digest::MD5 qw(md5_hex);
my $data = '';

my $first_line = <>;
chomp($first_line);
my @columns = split(/,/, $first_line);
while (<>) {
    chomp;
    my (@vals) = split /,/;
    my $record = "";
    foreach my $column_num (0..$#columns) {
        $record .= "^E$columns[$column_num]^A$vals[$column_num]";
    }
    my $digest = md5_hex($data);
    # The trailing ^E before ^D matches the requested output format.
    $data .= "__pkey__^A$digest$record^E^D";
}
print $data;
exit;

Upvotes: 1
