Roose DPogi
Roose DPogi

Reputation: 33

Transposing CSV data in Perl

I am a Perl beginner and currently working on a Perl script to automate some of our tasks. One script that I'm working on involves extracting performance data from our system, storing it in CSV files and generating Excel graphs. After a few days of working on this script, I have managed to get the extracted data into CSV, but now, I am having hard time in trying to transpose the data! I have seen this thread (thanks to dalton for the script): stackoverflow thread, but I can't seem to apply it in my case.

Basically, my CSV file contains a daily data per row, with the columns as the hours of the day (24 hours):

29-Aug-2013,3.68,3.63,3.75,3.65,3.65,3.11,3.34,2.74,2.83,2.52,3.19,4.24,3.84,3.61,3.69,2.96,2.76,2.91,3.70,3.82,3.70,3.54,2.54,3.90
30-Aug-2013,3.46,2.97,3.83,3.55,3.41,3.47,3.32,2.81,2.80,2.32,3.17,3.60,3.63,3.83,3.67,2.92,2.34,3.21,3.45,3.51,3.57,3.46,3.52,4.19
31-Aug-2013,3.19,3.50,4.01,3.91,3.71,3.33,3.20,2.95,2.90,2.37,3.07,3.48,2.86,3.29,3.22,2.52,1.83,2.83,3.54,3.49,3.62,3.59,3.54,3.31
01-Sep-2013,2.88,3.16,2.79,2.90,3.78,3.18,3.26,2.84,3.21,2.50,3.35,3.78,3.30,4.04,3.80,3.07,3.23,3.54,3.30,3.43,3.56,3.48,3.60,3.78
02-Sep-2013,3.28,2.92,3.89,3.78,3.54,3.09,3.08,2.79,2.87,2.43,2.70,3.64,3.79,3.88,3.88,3.28,2.90,3.37,3.25,3.60,3.45,3.39,2.84,4.07
03-Sep-2013,3.31,2.54,3.59,3.59,3.50,3.10,2.98,2.63,3.20,2.53,2.92,3.42,3.76,3.07,3.41,2.42,2.12,3.19,3.32,3.08,3.63,3.50,3.71,3.75
04-Sep-2013,3.64,3.48,2.86,3.57,3.68,3.53,3.34,2.89,2.79,2.64,3.30,4.04,4.17,3.70,3.81,2.96,3.41,3.48,3.66,3.05,3.23,3.41,3.15,4.31

Now, I want to transpose it so that the resulting data that I will write to a new CSV file will look something like this:

Time,29-Aug-2013,30-Aug-2013,1-Sep-2013,2-Sep-2013,3-Sep-2013,4-Sep-2013
01:00,3.68,3.46,3.19,2.88,3.28,3.31,3.64
02:00,3.63,2.97,3.50,3.16,2.92,2.54,3.48
03:00,3.75,3.83,4.01,2.79,3.89,3.59,2.86
...

Now, my script looks like this:

my @rows = ();
my @transposed = ();

open F1,"D:\\Temp\\perf_data.csv";
while(<F1>) {
    chomp;
    push @rows, split [ /,/ ];
}
#print @rows;

for my $row (@rows) {
  for my $column (0 .. $#{$row}) {
    push(@{$transposed[$column]}, $row->[$column]);
  }
}

for my $new_row (@transposed) {
  for my $new_col (@{$new_row}) {
      print $new_col, ",";
  }
  print "\n";
}

I can't even get a result from this already! Can someone help give me some hints on how I can do this? Thanks in advance!

Upvotes: 3

Views: 3395

Answers (2)

LanDenLabs
LanDenLabs

Reputation: 1656

Here is my Perl program to transpose row data into columns. A row starts with the headline name followed by one or more values. In my case I needed to remove the date (mm/dd/yyyy) from the headlines so the remainder of the headline field would be unique across multiple rows.

sub usage { << "EOF";

Convert rows to columns.
Remove dates from column headings. 

Usage:
    perl $0
Example:
   $0 data-to-transpose.txt

Source data:
    header1, row1Value1, row2Value2
    header2, row2Value1
    header3 11/31/2011, row3Value1, row3Value2
Output:
    header1, header2, header3
    row1Value1, row2Value1, row3Value1
    row1Value2, , row3Value2

EOF
}
#
#-------------------------------------------------------------------------------

use 5.010;
use strict;
use warnings;

# use Data::Dumper;
sub printColumns;

my $inFile = shift or die usage();
# @ARGV = ('.') unless @ARGV;

my @headers;        # Order list of column headers
my %data;           # map{colHeader, arrayColSourceData }
my $colCnt = 0;     # maximum number of columns in source data, header, value1, value2, ....
my $printColHeaders = 1;

my %hasharray; open (my $fh, "<", $inFile) or die "can't open the $inFile";
while (<$fh>) {
    chomp;
    my @parts = split /,/; 

    if (@parts > 1) {
        # Remove date from heading field
        (my $header = $parts[0]) =~ s/[0-9]+\/[0-9]+\/[0-9]+//;

        if (!exists $data{$header}) {
           push @headers, $header;
        }

        my $have = $data{$header};
        if (defined $data{$header}) {
            if ($printColHeaders == 1) {
                $printColHeaders = 0;
                foreach my $col (@headers) {
                    print "$col,";
                }
                print "\n";
            }

            printColumns();

            foreach my $col (@headers) {
                 $data{$col} = undef;
            }
        } 

        $data{$header} = \@parts;
        $colCnt = (@parts > $colCnt) ? @parts : $colCnt;
    }
} 

printColumns();
print "\n";

foreach my $col (@headers) {
    print "$col,";
}
print "\n";

#### Subs 
sub printColumns() {
    for my $row (1 .. $colCnt-1) {
        foreach my $colHeader (@headers) {
            my $colData = $data{$colHeader};
            if (defined $colData) {
                my $len=@$colData;
                if (defined $colData && $row < @$colData) {
                    print "$colData->[$row], ";
                } else {
                    print ", ";
                }
            } else {
                print ", ";
            }
        }
        print "\n";
    } 
}

Upvotes: 0

TLP
TLP

Reputation: 67908

You made one simple, but critical mistake.

split [ /,/ ] 

should be

[ split /,/ ]

The syntax for split is

split /PATTERN/, EXPR, LIMIT

Where the latter two are optional. What you are doing is passing an anonymous array ref as PATTERN, which most likely gets stringified into something like ARRAY(0x54d658). The result is that the line is not split, and the whole line is pushed onto the array. Later on, that will cause the dereference of $row to fail with the error

Can't use string ("29-Aug-2013,3.68,3.63,3.75,3.65,"...) as an ARRAY ref while "
strict refs" in use at foo.pl line 18, <F1> line 7.

Upvotes: 3

Related Questions