CoderGrrl
CoderGrrl

Reputation: 35

How to write arrays to CSV as columns

I have a hash of arrays. Each hash key is a column name. Each array is the values for that column.

$myHash{column1} = [value1, value2, ..., valueN];
$myHash{column2} = [value1, value2, ..., valueM];
...

Each array is of different length.

I want to output this as a CSV file with each array being a column. I looked at Text::CSV but don't see how to make this structure work with that module.

Thank you.

Upvotes: 0

Views: 2794

Answers (3)

Corion
Corion

Reputation: 3925

This is easy using Text::CSV_XS (or Text::CSV, as they share the same API). Just traverse the arrays in your hash by index/row number:

my @columns = sort keys %myHash;
my $rows = scalar @{ $myHash{ $columns[0] }};
my $writer = Text::CSV_XS->new();
open my $fh, '>', 'output.csv' or die "Couldn't write file: $!";

# Output headers
$writer->print($fh, \@columns);

# Output payload
for my $row (0..$rows) {
    $writer->print($fh, [map { $myHash{$_}->[$row] } @columns]);
};

After the edit of the question, this answer doesn't fully apply anymore. If the arrays are of different length, there is no way to add values for the missing columns.

Upvotes: 1

Stefan Becker
Stefan Becker

Reputation: 5962

As stated in my comment, the problem description is incomplete. My solution assumes:

  1. hash keys are named column1, ..., column9, column10, and so on, i.e. the columns in the generated CSV should be numerically sorted by the embedded number.
  2. the longest array defines how many rows should be in the generated CSV, i.e. undefined values should map to empty cells.
#!/usr/bin/perl
use strict;
use warnings;

use List::Util qw(max);
use Text::CSV;

# read in hash-of-arrays
my %hash;
while (<DATA>) {
    chomp;
    my($column, @values) = split(' ');
    $hash{$column} = \@values;
}

# column sort order (using Schwartzian transform)
my @column_order =
    map  { $_->[0] }               # [ "columnN", N ] -> "columnN"
    sort { $a->[1] <=> $b->[1] }   # numerical sort, ascending
    map  { [ /^(column(\d+))$/ ] } # "columnN" -> [ "columnN", N ]
    keys %hash;

# calculate highest array index
my $max_index =
    max
    map { $#{ $_ } }
    values %hash;

# transpose hash-of-arrays and print CSV to STDOUT
my $csv = Text::CSV->new();
$csv->eol("\n");    
$csv->print(\*STDOUT, \@column_order);
for my $index (0..$max_index) {
    my @row =
        map { $hash{$_}->[$index] // '' }
        @column_order;
    $csv->print(\*STDOUT, \@row);
}

exit 0;

__DATA__
column1   1  2  3
column2   4  5  6 7
column3   8  9
column4  10 11 12
column5  13 14 15 16 17 18
column6
column7  19 20 21 22
column8  23
column9  24 25 26
column10 27 28 29 30

Run with the embedded test data

$ perl dummy.pl
column1,column2,column3,column4,column5,column6,column7,column8,column9,column10
1,4,8,10,13,,19,23,24,27
2,5,9,11,14,,20,,25,28
3,6,,12,15,,21,,26,29
,7,,,16,,22,,,30
,,,,17,,,,,
,,,,18,,,,,

Upvotes: 0

Andrey
Andrey

Reputation: 1818

You can do it in two steps: 1. Covert your hash of arrays into array of arrays 2. Write the csv.

My solution allows you to have blank columns. If you define in your hash columns 1,3,5, columns 2,4 will be blank.

my @twoDarray;
foreach my $row (keys %myHash) {
    my ($colIndex) = $row =~ /(\d+)$/;
    $colIndex--;
    $twoDarray[0][$colIndex] = $row;
    foreach my $rowIndex(0..$#{$myHash{$row}}) {
        $twoDarray[++$rowIndex][$colIndex] = $myHash{$row}->[$rowIndex];
    }
}

my $finalFile = "output.csv";
open my $OUT, ">", $finalFile;
my $csvTo = Text::CSV_XS->new ({ binary => 1, auto_diag => 2, sep_char => ',', allow_whitespace => 1, eol => "\n"});
$csvTo->print ($OUT, $_) for (@twoDarray);
close $OUT;

Upvotes: 0

Related Questions