user1723765

Reputation: 6409

Using sample() when x is of varying length

I have a list whose elements ("rows") have different lengths (sometimes just length 1).

I would like to apply sample to each row by using

sapply(1:99, function(x) sample(mat[[x]], 1))

The problem, of course, is that whenever a row has length one, sample draws from 1:x instead of always returning that single value.
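A minimal sketch of the behaviour described above, using made-up values (the number 7 and the seed are arbitrary and do not come from the original data). When x is a single number that is at least 1, sample() treats it as the upper bound of 1:x:

```r
# Illustrative only: a length-one numeric vector is treated as an upper bound.
set.seed(1)                               # arbitrary seed, for reproducibility
draws <- replicate(200, sample(7, 1))     # equivalent to sampling from 1:7
range(draws)                              # values lie between 1 and 7, not only 7

# A vector with two or more elements is sampled from directly.
two_or_more <- sample(c(7, 9), 1)         # always either 7 or 9
```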

Is there a way to force sample to return the same value whenever the length is 1? Or is there an alternative way to avoid this problem?

Upvotes: 0

Views: 207

Answers (3)

Greg Snow

Reputation: 49670

You could use the example on the help page ?sample:

resample <- function(x, ...) x[sample.int(length(x), ...)]

Just use the above resample function in place of sample. Or rename it, modify it, etc. if you want it to work a little differently.
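As a rough usage sketch, here is resample applied to a small list. The list mat below is hypothetical stand-in data, not the asker's; the point is only that a length-one element comes back unchanged because resample samples an index rather than the values:

```r
# resample as given on the ?sample help page: sample positions, then subset.
resample <- function(x, ...) x[sample.int(length(x), ...)]

# Hypothetical list with elements of different lengths (assumed data).
mat <- list(c(3, 8, 12), 10, c(2, 5))

# One draw per element; the second element has length 1, so it is returned as is.
picks <- sapply(mat, resample, size = 1)
picks
```

Because sample.int(1, 1) can only return 1, the length-one case needs no special handling.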

To satisfy my own curiosity I did a quick benchmark of the suggestions so far:

library(microbenchmark)

mylist <- lapply( sample( rep( 1:10, 10 ) ), rpois, lambda=3 )

resample <- function(x, ...) x[sample.int(length(x), ...)]
sample1 <- function(x) x[sample.int(length(x), 1)]
ie1 <- function(x) if(length(x)==1) x else sample(x,1)
ie2 <- function(x) ifelse( length(x)==1, x, sample(x,1) )
rep1 <- function(x) { if( length(x) < 2 ) x <- rep(x,2); sample(x,1) }

(out <- microbenchmark( 
    sapply(mylist, resample, size=1),
    sapply(mylist, sample1),
    sapply(mylist, ie1),
    sapply(mylist, ie2),
    sapply(mylist, rep1)
))

With results:

Unit: microseconds
                               expr      min        lq    median        uq      max neval
 sapply(mylist, resample, size = 1)  360.846  388.1455  398.4085  409.4925 2036.169   100
            sapply(mylist, sample1)  339.499  365.7720  375.8300  391.6345 1846.100   100
                sapply(mylist, ie1)  493.853  534.2900  543.3205  561.3840 2091.589   100
                sapply(mylist, ie2) 1225.397 1291.6955 1328.4365 1395.1455 3787.850   100
               sapply(mylist, rep1)  566.926  614.3405  627.2720  649.4405 2178.209   100

Upvotes: 2

Señor O

Reputation: 17432

Since the 1:x behaviour is hard-coded into sample, the best option is just to use if/else:

sapply(mat[1:99], function(x) if(length(x)==1) x else sample(x, 1))

Upvotes: 3

Carl Witthoft

Reputation: 21532

Once you have matrix vs. dataframe or whatever it is straightened out, here's a workaround I've used:

vec.len<-length(my_vector)
if (vec.len <2 ) my_vector<-rep(my_vector,2)
sample(my_vector,1)

Upvotes: 0
