Reputation: 521
I want to create a hash with column header as hash key and column values as hash values in Perl.
For example, if my csv file has the following data:
A,B,C,D,E
1,2,3,4,5
6,7,8,9,10
11,12,13,14,15 ...
I want to create a hash as follows:
A=> 1,6,11
B=>2,7,12
c=>3,8,13 ...
So that just by using the header name I can use that column values. Is there a way in PERL to do this? Please help me.
I was able to store required column values as array using the following script
use strict;
use warnings;
open( IN, "sample.csv" ) or die("Unable to open file");
my $wanted_column = "A";
my @cells;
my @colvalues;
my $header = <IN>;
my @column_names = split( ",", $header );
my $extract_col = 0;
for my $header_line (@column_names) {
last if $header_line =~ m/$wanted_column/;
$extract_col++;
}
while ( my $row = <IN> ) {
last unless $row =~ /\S/;
chomp $row;
@cells = split( ",", $row );
push( @colvalues, $cells[$extract_col] );
}
my $sizeofarray = scalar @colvalues;
print "Size of the coulmn= $sizeofarray";
But I want to do this to all my column.I guess Hash of arrays will be the best solution but I dont know how to implement it.
Upvotes: 1
Views: 2606
Reputation: 4709
Another solution by using for
loop :
use strict;
use warnings;
my %data;
my @columns;
open (my $fh, "<", "file.csv") or die "Can't open the file : ";
while (<$fh>)
{
chomp;
my @list=split(',', $_);
for (my $i=0; $i<=$#list; $i++)
{
if ($.==1) # collect the columns, if its first line.
{
$columns[$i]=$list[$i];
}
else #collect the data, if its not the first line.
{
push @{$data{$columns[$i]}}, $list[$i];
}
}
}
foreach (@columns)
{
local $"="\,";
print "$_=>@{$data{$_}}\n";
}
Output will be like this :
A=>1,6,11
B=>2,7,12
C=>3,8,13
D=>4,9,14
E=>5,10,15
Upvotes: 1
Reputation: 53498
Text::CSV
is a useful helper module for this sort of thing.
use strict;
use warnings;
use Text::CSV;
use Data::Dumper;
my %combined;
open( my $input, "<", "sample.csv" ) or die("Unable to open file");
my $csv = Text::CSV->new( { binary => 1 } );
my @headers = @{ $csv->getline($input) };
while ( my $row = $csv->getline($input) ) {
for my $header (@headers) {
push( @{ $combined{$header} }, shift(@$row) );
}
}
print Dumper \%combined;
Since you requested without a module - you can use split
but you need to bear in mind the limitations. CSV
format allows for things like commas nested in quotes. split
won't handle that case very well.
use strict;
use warnings;
use Data::Dumper;
my %combined;
open( my $input, "<", "sample.csv" ) or die("Unable to open file");
my $line = <$input>;
chomp ( $line );
my @headers = split( ',', $line );
while (<$input>) {
chomp;
my @row = split(',');
for my $header (@headers) {
push( @{ $combined{$header} }, shift(@row) );
}
}
print Dumper \%combined;
Note: Both of these will effectively ignore any extra columns that don't have headers. (And will get confused by duplicate column names).
Upvotes: 4