Reputation: 3121
Example.
If input is
Output should be
So each row must consist only same values or undef
if original unsorted column had no such string. Values in columns should be alphabetically sorted.
How to realize that type of sorting ?
P.S. Original task - we have some modules and we want to compare them visually for functions with similar names.
Upvotes: 1
Views: 157
Reputation: 66901
A little manual but hopefully clear approach to "fill in" the missing spots: Gather a sorted reference of all values and use it to "pad" (with empty) for missing elements in each arrayref-column
use warnings;
use strict;
use feature 'say';
use Data::Dump qw(dd);
use List::MoreUtils qw(uniq);
my @data = (
[ qw(abc def ghi xyz) ],
[ qw(def jkl mno uvw xyz) ],
[ qw(abc uvw xyz) ]
);
my @all = uniq sort { $a cmp $b } map { @$_ } @data; # reference
# Changes @data in place. Use on deep copy to preserve the original
for my $ary (@data) {
my $cmp_at = 0;
my @res;
for my $i (0..$#all) {
if ($ary->[$cmp_at] eq $all[$i]) {
push @res, $ary->[$cmp_at];
++$cmp_at;
}
else {
push @res, undef;
}
}
$ary = \@res; # overwrite arrayref in @data
}
dd \@data;
Another way is to go row-wise and inspect and print for each element, again using a reference list. The column-wise preprocessing used here should be more flexible and general, I think.
The above prints
[ ["abc", "def", "ghi", undef, undef, undef, "xyz"], [undef, "def", undef, "jkl", "mno", "uvw", "xyz"], ["abc", undef, undef, undef, undef, "uvw", "xyz"], ]
where string "undef" produced by Data::Dump indicates array entries with nothing.
Now @data
can be used to print in a chosen format, for example
use List::Util qw(max);
my $fmt = '%' . (max map { length } @all) . 's';
say join "\t", map { sprintf $fmt, $_ } qw(Foo Bar Baz);
for my $i (0..$#{$data[0]}) {
say join "\t", map { sprintf $fmt, $_->[$i]//'' } @data;
}
what prints a table in the fashion of the desired "output." For quick alignment I use the width of the longest word for all fields; this is most easily improved by using a text-formatting module.
If this goes to a spreadsheet, like tables in the question, then just join fields with a comma
open my $fh, '>', 'out.csv' or die "Can't open: $!";
say $fh join ',', qw(Foo Bar Baz);
foreach my $i (0..$#{$data[0]}) {
say $fh join ',', map { $_->[$i]//'' } @data;
}
Upvotes: 2
Reputation: 33
Just another different solution using hashes and arrays and Hash::Merge. Looking at your example I concluded that each value from the table can be stored as a single hash entry as long as you keep track in which column they belong. After the unsorted hash has been created the program finishes with a example print algorithm to extract the data in the format that you want.
use warnings;
use strict;
use Hash::Merge ;
my @data = (
{ name => 'Foo', funcs => [qw/abc def ghi xyz/] },
{ name => 'Bar', funcs => [qw/def jkl mno uvw xyz/] },
{ name => 'Baz', funcs => [qw/abc uvw xyz/] },
);
my $merger = Hash::Merge->new('RETAINMENT_PRECEDENT');
my $unsorted = {} ;
for my $i ( 0..$#data) {
my $tmpH = {} ;
foreach( @{$data[$i]->{funcs}} ) {
if( exists $tmpH->{ $_ } ) {
push @{$tmpH->{ $_ }}, $i ;
} else {
$tmpH->{ $_ } = [ $i ] ;
}
} ;
$unsorted = $merger->merge( $unsorted, $tmpH ) ;
}
print "Foo\tBar\tBaz\n" ;
foreach ( sort keys %{$unsorted} ) {
my @txt;
@txt[@{$unsorted->{ $_ }}] = ($_) x @{$unsorted->{ $_ }} ;
{
no warnings 'uninitialized';
printf("%s\t%s\t%s\n", $txt[0], $txt[1], $txt[2]) ;
}
}
__END__
Foo Bar Baz
abc abc
def def
ghi
jkl
mno
uvw uvw
xyz xyz xyz
Upvotes: 1
Reputation: 5962
The following code should do it, output format is CSV:
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV_XS qw( );
my $csv = Text::CSV_XS->new({ binary => 1, auto_diag => 2 });
my @headers = qw(Foo Bar Baz);
my @columns = (
[qw/abc def ghi xyz/],
[qw/def jkl mno uvw xyz/],
[qw/abc uvw xyz/],
);
my %output;
my $N = scalar(@columns);
my @new_row = ('') x $N;
foreach my $index (0..$N-1) {
my $column = $columns[$index];
foreach my $key (@{ $column }) {
$output{$key} ||= [ @new_row ];
$output{$key}->[$index] = $key;
}
}
$csv->say(\*STDOUT, \@headers);
# NOTE: add {....} after sort for special sorting requirements
foreach my $key (sort keys %output) {
$csv->say(\*STDOUT, $output{$key});
}
Example output:
$ perl dummy.pl
Foo,Bar,Baz,
abc,,abc,
def,def,,
ghi,,,
,jkl,,
,mno,,
,uvw,uvw,
xyz,xyz,xyz,
NOTE: the above is with the default sort algorithm, but you can insert your own, e.g.
# reverse sort
foreach my $key (sort { $b cmp $a } keys %output) {
Thanks for the question. This was fun :-)
EDIT2: guessing from the format of the question the input might also be in CSV, so a row-based variant of the algorithm might be more appropriate.
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV_XS qw( );
my $csv = Text::CSV_XS->new({ binary => 1, auto_diag => 2 });
my $headers;
my @new_row;
my %keys;
my $line = 0;
while ( my $row = $csv->getline(\*STDIN) ) {
if ($line == 0) {
$headers = $row;
@new_row = ('') x @$row;
} else {
foreach my $index (0..$#$row) {
my $key = $row->[$index];
$keys{$key} ||= [ @new_row ];
$keys{$key}->[$index] = $key;
}
}
$line++;
}
# delete "undefined" key
delete $keys{''};
$csv->say(\*STDOUT, $headers);
# NOTE: add {....} after sort for special sorting requirements
$csv->say(\*STDOUT, $keys{$_}) foreach (sort keys %keys);
Example output:
$ cat input.csv
Foo,Bar,Baz
abc,def,abc
def,jkl,uvw
ghi,mno,xyz
xyz,uvw,
,xyz,
$ perl dummy.pl <input.csv
Foo,Bar,Baz
abc,,abc
def,def,
ghi,,
,jkl,
,mno,
,uvw,uvw
xyz,xyz,xyz
Upvotes: 1
Reputation: 3013
Something like this maybe?
use warnings;
use strict;
my @data = (
{ name => 'Foo', funcs => [qw/abc def ghi xyz/] },
{ name => 'Bar', funcs => [qw/def jkl mno uvw xyz/] },
{ name => 'Baz', funcs => [qw/abc uvw xyz/] },
);
my %allfuncs = ( map { map {$_=>undef} @{$$_{funcs}} } @data );
$$_{funcs} = { %allfuncs, map {$_=>1} @{$$_{funcs}} } for @data;
use Data::Dump;
dd @data;
# just for output:
use List::Util qw/max/;
my $maxlen = max map {length} map({$$_{name}} @data), keys(%allfuncs);
my $fmt = join(' ', ("%${maxlen}s") x @data)."\n";
printf $fmt, map { $$_{name} } @data;
for my $f (sort keys %allfuncs) {
printf $fmt, map { $$_{funcs}{$f}?$f:'' } @data;
}
Output:
(
{
funcs => { abc => 1, def => 1, ghi => 1, jkl => undef, mno => undef, uvw => undef, xyz => 1 },
name => "Foo",
},
{
funcs => { abc => undef, def => 1, ghi => undef, jkl => 1, mno => 1, uvw => 1, xyz => 1 },
name => "Bar",
},
{
funcs => { abc => 1, def => undef, ghi => undef, jkl => undef, mno => undef, uvw => 1, xyz => 1 },
name => "Baz",
},
)
Foo Bar Baz
abc abc
def def
ghi
jkl
mno
uvw uvw
xyz xyz xyz
Update: If your input data comes in the form of a AoA, this takes @table
and produces the same @data
as above (it basically transposes the AoA and then produces the hash structure):
my @table = ( [qw/Foo Bar Baz/], [qw/abc def abc/], [qw/def jkl uvw/],
[qw/ghi mno xyz/], [qw/xyz uvw/], [undef, qw/xyz/] );
my @data;
for my $col ( 0 .. $table[0]->$#* )
{ push @data, [ map {$_->[$col]//()} @table ] }
@data = map { {name=>shift @$_, funcs=>$_} } @data;
And if you need your output format to be an AoA as well:
my @out = ( [map {$$_{name}} @data] );
for my $f (sort keys %allfuncs)
{ push @out, [ map {$$_{funcs}{$f}?$f:undef} @data ] }
Produces @out
:
(
["Foo", "Bar", "Baz"],
["abc", undef, "abc"],
["def", "def", undef],
["ghi", undef, undef],
[undef, "jkl", undef],
[undef, "mno", undef],
[undef, "uvw", "uvw"],
["xyz", "xyz", "xyz"],
)
Upvotes: 2