Reputation: 77
I am learning Perl and would like to convert a text file to a CSV file using Perl. I have a loop that generates the following text file:
# This part is what writes the text file
for my $row (@$data) {
    while (my ($key, $value) = each %$row) {
        print "${key}=${value}, ";
    }
    print "\n";
}
Text File Output:
name=Mary, id=231, age=38, weight=130, height=5.05, speed=26.233, time=30,
time=25, name=Jose, age=30, id=638, weight=150, height=6.05, speed=20.233,
age=40, weight=130, name=Mark, id=369, speed=40.555, height=5.07, time=30
CSV File Desired Output:
name,age,weight,height,speed,time
Mary,38,130,5.05,26.233,30
Jose,30,150,6.05,20.233,25
Mark,40,130,5.07,40.555,30
Any good feedback is welcome!
Upvotes: 2
Views: 1609
Reputation: 66883
The key part here is how to manipulate your data so as to extract what needs to be printed for each line. Then you are best off using a module to produce valid CSV, and Text::CSV is very good.
A program using an array of small hashrefs, mimicking the data in the question:
use strict;
use warnings;
use feature 'say';
use Text::CSV;

my @data = (
    { name => 'A', age => 1, weight => 10 },
    { name => 'B', age => 2, weight => 20 },
);

my $csv = Text::CSV->new({ binary => 1, auto_diag => 2 });

my $outfile = 'test.csv';
open my $ofh, '>', $outfile or die "Can't open $outfile: $!";

# Header, also used below for order of values for fields
my @hdr = qw(name age weight);
$csv->say($ofh, \@hdr);

# One CSV record per hashref, with values in header order
foreach my $href (@data) {
    $csv->say($ofh, [ @{$href}{@hdr} ]);
}

close $ofh or die "Can't close $outfile: $!";
The values from hashrefs are extracted in the desired order using a hashref slice, @{$href}{@hdr}, which in general is
@{ expression returning hash reference } { list of keys }
This returns a list of values for the given list of keys, from the hashref that the expression in the block {} must return. That list is then used to build an arrayref (an anonymous array here, using []), which is what the module's say method needs in order to make and print a string of comma-separated values† from that list of values.
Note the block that evaluates to a hash reference, used in place of the hash name that one would write for a slice of a plain hash. This follows the general rule that
Anywhere you'd put an identifier (or chain of identifiers) as part of a variable or subroutine name, you can replace the identifier with a BLOCK returning a reference of the correct type.
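To make the slice syntax concrete, here is a minimal sketch that can be run on its own. The record and its values are made up for illustration (borrowing field names from the question), and the named hash %h exists only for comparison with the hashref form.

```perl
use strict;
use warnings;

# Hypothetical record, using field names from the question's data
my $href = { name => 'Mary', id => 231, age => 38, weight => 130 };
my @hdr  = qw(name age weight);

# Hashref slice: a block returning a hash reference, then a list of keys
my @values = @{$href}{@hdr};

# The same slice taken from a named hash, for comparison
my %h    = %$href;
my @same = @h{@hdr};

print join(',', @values), "\n";   # Mary,38,130
print join(',', @same),   "\n";   # Mary,38,130

# A key that is missing from the hash yields undef in its slot,
# so the returned list always has one element per requested key
my @with_missing = @{$href}{qw(name height)};
print scalar(@with_missing), "\n";   # 2
```

The last part is worth keeping in mind when rows do not all carry the same keys: the slice keeps the columns aligned, and the missing value shows up as an empty field rather than shifting the rest of the record.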
Some further comments
Look over the constructor's supported attributes; there are many goodies.
For very simple data you can simply join fields with a comma and print:
say $ofh join ',', @{$href}{@hdr};
But it is far safer to use a module to construct a valid CSV record. With the right choice of attributes in the constructor it can handle whatever is legal to embed in fields (some of which can take quite a bit of work to do correctly by hand), and it flags things which aren't legal.
I list column names explicitly. Instead, you can fetch the keys and then sort them in a desired order, but this will again need a hard-coded list for sorting.
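One possible way to do the fetch-and-sort from the last comment is sketched below. The @preferred list and the %rank lookup are names invented for this example, and the choice to put unknown keys last in alphabetical order is an assumption, not something the question asks for.

```perl
use strict;
use warnings;

# Hypothetical record; hash key order is not predictable in Perl
my $href = { time => 30, name => 'Mary', age => 38, id => 231 };

# Hard-coded preferred order (an assumption for illustration)
my @preferred = qw(name id age);

# Map each preferred key to its position: name => 0, id => 1, age => 2
my %rank = map { $preferred[$_] => $_ } 0 .. $#preferred;

# Known keys come first, by rank; unknown keys follow, alphabetically
my @cols = sort {
    ($rank{$a} // scalar @preferred) <=> ($rank{$b} // scalar @preferred)
        or $a cmp $b
} keys %$href;

print join(',', @cols), "\n";   # name,id,age,time
```

This still needs the hard-coded list, as noted above, but it does pick up keys that the list does not mention instead of silently dropping them.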
The program creates the file test.csv and prints to it the expected header and data lines.
† But separating those "values" with commas may involve a whole lot more than what the acronym for the "CSV format" stands for. A variety of things may come between those commas, including commas, newlines, and whatnot. This is why one is best advised to always use a library. Looking over the constructor's options is informative.
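A small sketch of what the footnote warns about, using only core Perl so that it runs anywhere. The field values are invented for the example; the point is only that a plain join followed by a plain split does not round-trip once a field contains a comma.

```perl
use strict;
use warnings;

# Hypothetical record whose first field contains a comma
my @fields = ('Doe, Jane', 38, 130);

# Naive CSV: join with commas, no quoting or escaping
my $line = join ',', @fields;
print "$line\n";   # Doe, Jane,38,130

# Reading it back naively splits the name into two fields
my @back = split /,/, $line;
print scalar(@fields), " fields written, ",
      scalar(@back),   " fields read back\n";   # 3 fields written, 4 fields read back
```

A CSV library avoids this by quoting such fields when writing and by honoring the quotes when reading.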
The following commentary referred to the initial question. In the meantime the problems it addresses were corrected in the OP's code and the question was updated. I'm still leaving this text for some general comments that can be useful.
As for the code in the question and its output, there is almost certainly an issue with how the data is processed to produce @data, judging by the presence of keys HASH(address) in the output.
That string HASH(0x...) is output when one prints a variable which is a hash reference (which cannot show any of the hash's content). Perl handles such a print by stringifying the reference in that way, that is, by producing a printable string out of something which is more complex.
There is no good reason to have a hash reference for a hash key. So I'd suggest that you review your data and its processing and see how that comes about. (Or briefly show this, or post another question with it if it isn't feasible to add that to this one.)
One measure you can use to bypass that is to only use a list of keys that you know are valid, as I show above; however, then you may be leaving some outright error unhandled. So I'd rather suggest finding what is wrong.
Upvotes: 7