Reputation: 35
I have a hash of arrays. Each hash key is a column name. Each array is the values for that column.
$myHash{column1} = [value1, value2, ..., valueN];
$myHash{column2} = [value1, value2, ..., valueM];
...
Each array is of different length.
I want to output this as a CSV file with each array being a column. I looked at Text::CSV but don't see how to make this structure work with that module.
Thank you.
Upvotes: 0
Views: 2794
Reputation: 3925
This is easy using Text::CSV_XS (or Text::CSV, as they share the same API). Just traverse the arrays in your hash by index/row number:
my @columns = sort keys %myHash;
my $rows = scalar @{ $myHash{ $columns[0] }};
my $writer = Text::CSV_XS->new();
open my $fh, '>', 'output.csv' or die "Couldn't write file: $!";
# Output headers
$writer->print($fh, \@columns);
# Output payload
for my $row (0..$rows) {
$writer->print($fh, [map { $myHash{$_}->[$row] } @columns]);
};
After the edit of the question, this answer doesn't fully apply anymore. If the arrays are of different length, there is no way to add values for the missing columns.
Upvotes: 1
Reputation: 5962
As stated in my comment, the problem description is incomplete. My solution assumes:
column1
, ..., column9
, column10
, and so on, i.e. the columns in the generated CSV should be numerically sorted by the embedded number.#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(max);
use Text::CSV;
# read in hash-of-arrays
my %hash;
while (<DATA>) {
chomp;
my($column, @values) = split(' ');
$hash{$column} = \@values;
}
# column sort order (using Schwartzian transform)
my @column_order =
map { $_->[0] } # [ "columnN", N ] -> "columnN"
sort { $a->[1] <=> $b->[1] } # numerical sort, ascending
map { [ /^(column(\d+))$/ ] } # "columnN" -> [ "columnN", N ]
keys %hash;
# calculate highest array index
my $max_index =
max
map { $#{ $_ } }
values %hash;
# transpose hash-of-arrays and print CSV to STDOUT
my $csv = Text::CSV->new();
$csv->eol("\n");
$csv->print(\*STDOUT, \@column_order);
for my $index (0..$max_index) {
my @row =
map { $hash{$_}->[$index] // '' }
@column_order;
$csv->print(\*STDOUT, \@row);
}
exit 0;
__DATA__
column1 1 2 3
column2 4 5 6 7
column3 8 9
column4 10 11 12
column5 13 14 15 16 17 18
column6
column7 19 20 21 22
column8 23
column9 24 25 26
column10 27 28 29 30
Run with the embedded test data
$ perl dummy.pl
column1,column2,column3,column4,column5,column6,column7,column8,column9,column10
1,4,8,10,13,,19,23,24,27
2,5,9,11,14,,20,,25,28
3,6,,12,15,,21,,26,29
,7,,,16,,22,,,30
,,,,17,,,,,
,,,,18,,,,,
Upvotes: 0
Reputation: 1818
You can do it in two steps: 1. Covert your hash of arrays into array of arrays 2. Write the csv.
My solution allows you to have blank columns. If you define in your hash columns 1,3,5, columns 2,4 will be blank.
my @twoDarray;
foreach my $row (keys %myHash) {
my ($colIndex) = $row =~ /(\d+)$/;
$colIndex--;
$twoDarray[0][$colIndex] = $row;
foreach my $rowIndex(0..$#{$myHash{$row}}) {
$twoDarray[++$rowIndex][$colIndex] = $myHash{$row}->[$rowIndex];
}
}
my $finalFile = "output.csv";
open my $OUT, ">", $finalFile;
my $csvTo = Text::CSV_XS->new ({ binary => 1, auto_diag => 2, sep_char => ',', allow_whitespace => 1, eol => "\n"});
$csvTo->print ($OUT, $_) for (@twoDarray);
close $OUT;
Upvotes: 0