broccoli

Reputation: 4846

Probabilistically Sample from an Array in Perl

I have an array in Perl and would like to draw samples from it according to a given set of probabilities. In R, the sample function does this for me, e.g.

x = c('a','b','c','d')
sample(x,size = 2,prob = c(0.1,0.4,0.4,0.1))

The code above returns b and c more often than a and d. How do I do the same in Perl? Is there a module that does this for me? Thanks in advance.

Upvotes: 1

Views: 161

Answers (2)

MrFlick

Reputation: 206496

With core Perl alone you could do something like this, which draws 20 samples with replacement:

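use strict;
use warnings;

my @vals  = ('a', 'b', 'c', 'd');
my @probs = (0.1, 0.4, 0.4, 0.1);   # expected to sum to 1
my @result;
for my $i (1 .. 20) {
    my $draw = rand(1);              # uniform in [0, 1)
    my $v    = 0;
    # Subtract each probability from the draw until the draw falls inside
    # the current one. The $v < $#probs guard keeps floating-point rounding
    # from stepping past the last element.
    while ($v < $#probs && $draw >= $probs[$v]) {
        $draw -= $probs[$v];
        $v++;
    }
    push @result, $vals[$v];
}

print join(", ", @result), "\n";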
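One difference from the R call in the question: R's sample defaults to replace = FALSE, so sample(x, size = 2, prob = ...) never returns the same element twice, whereas the loop above draws with replacement. Below is a minimal sketch of a without-replacement variant, assuming non-negative weights. The name sample_without_replacement is a hypothetical helper, not a function from any module.

```perl
use strict;
use warnings;

# Hypothetical helper: draw up to $size distinct items from @$vals, picking
# each one with probability proportional to its weight among the items left.
sub sample_without_replacement {
    my ($vals, $weights, $size) = @_;
    my @v = @$vals;       # work on copies so the caller's arrays are untouched
    my @w = @$weights;
    my @picked;
    while (@picked < $size && @v) {
        my $total = 0;
        $total += $_ for @w;
        last if $total <= 0;            # nothing with positive weight left
        my $draw = rand($total);        # uniform in [0, $total)
        my $i = 0;
        # Walk the remaining weights until the draw falls inside one.
        while ($i < $#w && $draw >= $w[$i]) {
            $draw -= $w[$i];
            $i++;
        }
        push @picked, splice(@v, $i, 1);   # remove the chosen item ...
        splice(@w, $i, 1);                 # ... and its weight
    }
    return @picked;
}

my @pair = sample_without_replacement(['a', 'b', 'c', 'd'], [0.1, 0.4, 0.4, 0.1], 2);
print join(", ", @pair), "\n";
```

Because the weights are re-summed after every removal, they do not need to add up to 1 here; only their relative sizes matter.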

Upvotes: 0

Diab Jerius

Reputation: 2320

This can be done efficiently for large sample sizes in PDL using the vsearch function.

use strict;
use warnings;

use PDL;

my @x = qw( a b c d );

my $pdf = pdl( 0.1, 0.4, 0.4, 0.1 );

# vsearch requires a CDF, so accumulate the PDF and normalize it to end at 1
my $cdf = $pdf->dcumusumover;
$cdf /= $cdf->max;

# $sample is a piddle containing the indices into @x;
my $sample = vsearch( random(10000), $cdf );

print scalar hist( $sample, 0, 4, 1 ), "\n";

Running it prints the number of draws for each index 0..3, which tracks the requested probabilities:

% perl x.pl
[991 3974 4014 1021]
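The indices in $sample still have to be mapped back to the elements of @x. For readers without PDL, the same CDF-lookup idea can be sketched in core Perl. This illustrates the technique only and is not how vsearch is implemented; build_cdf and cdf_index are hypothetical names.

```perl
use strict;
use warnings;

# Hypothetical helper: running sum of the weights, normalized so that the
# last entry is 1.
sub build_cdf {
    my @pdf = @_;
    my $sum = 0;
    my @cdf;
    for my $p (@pdf) {
        $sum += $p;
        push @cdf, $sum;
    }
    return map { $_ / $sum } @cdf;
}

# Hypothetical helper: smallest index $i with $u < $cdf->[$i], found by
# binary search (falls back to the last index if no entry exceeds $u).
sub cdf_index {
    my ($cdf, $u) = @_;
    my ($lo, $hi) = (0, $#$cdf);
    while ($lo < $hi) {
        my $mid = int(($lo + $hi) / 2);
        if ($u < $cdf->[$mid]) { $hi = $mid }
        else                   { $lo = $mid + 1 }
    }
    return $lo;
}

my @x   = qw( a b c d );
my @cdf = build_cdf(0.1, 0.4, 0.4, 0.1);

# Draw 10000 samples with replacement and count how often each element appears.
my %count;
$count{ $x[ cdf_index(\@cdf, rand()) ] }++ for 1 .. 10000;
print join(" ", map { "$_=" . ($count{$_} // 0) } @x), "\n";
```

Precomputing the CDF once makes each draw a logarithmic-time lookup, which matters mainly when the array is long; for four elements the linear walk is just as good.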

Upvotes: 1
