Reputation: 13
Suppose I draw a sample of 12 values from a binomial distribution with size 1 and p = 0.2, i.e. rbinom(12, 1, 0.2). I split this sample into 4 chunks (groups) of size 3 each, then drop every chunk whose sum equals 0. I want to combine the remaining chunks into a single new vector. Here's my code:
set.seed(123)
sample1=rbinom(12,1,0.2)
chuck2=function(x,n)split(x,cut(seq_along(x),n,labels=FALSE))
chunk=chuck2(sample1,4)
for (i in 1:4){
aa=chunk[[i]]
if (sum(aa)!=0){
a.no0=aa
print(a.no0)
}
}
And here's the output:
[1] 1 1 0
[1] 0 1 0
[1] 0 1 0
I want to combine these three outputs into a new vector like:
[1] 1 1 0 0 1 0 0 1 0
but I don't know how to do that. Any hints, please?
Upvotes: 1
Views: 69
Reputation: 11255
It seems like your function makes a pseudo-matrix as a list. This instead builds a matrix directly from sample1 and then returns, as a vector, the rows whose rowSums are nonzero.
set.seed(123)
sample1 = rbinom(12, 1, 0.2)
chunk_mat = matrix(sample1, ncol = 3, byrow = TRUE)
as.vector(t(chunk_mat[which(rowSums(chunk_mat) != 0), ]))
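The same idea can be wrapped in a small helper so the group size is not hard-coded. This is only a sketch: the name keep_nonzero_chunks and its size argument are made up for illustration, and it assumes length(x) is a multiple of size.

```r
# Hypothetical helper (not from the original answer): drop every chunk of
# `size` consecutive values whose sum is 0, then flatten what is left.
# Assumes length(x) is a multiple of size.
keep_nonzero_chunks <- function(x, size = 3) {
  stopifnot(length(x) %% size == 0)
  m <- matrix(x, ncol = size, byrow = TRUE)  # one chunk per row
  keep <- rowSums(m) != 0                    # TRUE for chunks to keep
  # drop = FALSE keeps a matrix even if only one row survives
  as.vector(t(m[keep, , drop = FALSE]))
}

set.seed(123)
sample1 <- rbinom(12, 1, 0.2)
keep_nonzero_chunks(sample1)
```

Transposing before as.vector() matters because R stores matrices column by column; without t() the values of different chunks would be interleaved.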
Here are benchmarks. chuck2 is defined in the global environment, but each function still has to build its own chunk data frame / matrix / list, so the comparison is apples to apples.
Unit: microseconds
expr                 min        lq       mean    median        uq       max neval
cole_matrix       19.902   26.2515   38.60094   43.3505   47.4505    56.801   100
heds_int_vector 4965.201 5101.9010 5616.53893 5251.8510 5490.9010 23417.401   100
bwilliams_dplyr 5278.602 5506.4010 5847.55298 5665.7010 5821.5515  9413.801   100
Simon_base       128.501  138.0010  196.46697  185.6005  203.1515  2481.101   100
Simon_magrittr   366.601  392.5005  453.74806  455.1510  492.0010   739.501   100
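The table has the shape of output from the microbenchmark package. A comparison of that kind could be set up roughly as below. This is a sketch, not the code that produced the numbers above: the wrapper names matrix_version and lapply_version are made up here, their bodies are reconstructed from two of the answers on this page, and timings will differ from machine to machine.

```r
set.seed(123)
sample1 <- rbinom(12, 1, 0.2)

# Each wrapper builds its own chunks so the comparison is like for like.
matrix_version <- function(x) {
  m <- matrix(x, ncol = 3, byrow = TRUE)
  as.vector(t(m[rowSums(m) != 0, , drop = FALSE]))
}
lapply_version <- function(x) {
  chunks <- split(x, cut(seq_along(x), 4, labels = FALSE))
  unlist(lapply(chunks, function(g) if (sum(g) == 0) NULL else g),
         use.names = FALSE)
}

# Both versions should agree before timing them.
stopifnot(identical(matrix_version(sample1), lapply_version(sample1)))

# microbenchmark is an extra package; skip the timing if it is not installed.
if (requireNamespace("microbenchmark", quietly = TRUE)) {
  print(microbenchmark::microbenchmark(
    matrix_version(sample1),
    lapply_version(sample1),
    times = 100
  ))
}
```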
Upvotes: 0
Reputation: 81
Two versions without a for loop.
data:
set.seed(123)
sample1 <- rbinom(12, 1, 0.2)
base-R functional version:
split.sample1 <- split(sample1,cut(seq_along(sample1),4,labels=FALSE))
sumf <- function(x) if(sum(x) == 0) NULL else x
result <- unlist(lapply(split.sample1,sumf),use.names=FALSE)
> result
[1] 1 1 0 0 1 0 0 1 0
Version using the %>% pipe operator:
library(magrittr) # for %>% operator
grp.indx <- cut(seq_along(sample1),4,labels=FALSE)
split.sample1 <- sample1 %>% split(grp.indx)
result <- split.sample1 %>% lapply(sumf) %>% unlist(use.names=FALSE)
> result
[1] 1 1 0 0 1 0 0 1 0
Upvotes: 0
Reputation: 3438
set.seed(123)
sample1=rbinom(12,1,0.2)
chuck2=function(x,n)split(x,cut(seq_along(x),n,labels=FALSE))
chunk=chuck2(sample1,4)
int_vector <- c()
for (i in 1:4){
aa=chunk[[i]]
if (sum(aa)!=0){
a.no0=aa
int_vector <- c(int_vector, a.no0)
}
}
int_vector
# [1] 1 1 0 0 1 0 0 1 0
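Growing int_vector with c() inside the loop is fine for four chunks. As a hedged aside, the same filtering can also be written with base R's Filter(), which keeps the list elements for which a predicate returns TRUE. This swaps the loop for a different technique and is not part of the answer above.

```r
set.seed(123)
sample1 <- rbinom(12, 1, 0.2)
chunk <- split(sample1, cut(seq_along(sample1), 4, labels = FALSE))

# Keep only the chunks whose sum is non-zero, then flatten the list.
kept <- Filter(function(g) sum(g) != 0, chunk)
int_vector <- unlist(kept, use.names = FALSE)
int_vector
```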
Upvotes: 2
Reputation: 2050
Doesn't directly address your issue, but this can be accomplished without a for-loop:
library(dplyr)
library(tidyr) # gather() comes from tidyr, not dplyr
set.seed(123)
sample1 <- rbinom(12, 1, 0.2)
as.data.frame(matrix(sample1, ncol = 3, byrow = TRUE)) %>%
mutate(test = rowSums(.), id = 1:n()) %>%
filter(test > 0) %>%
dplyr::select(-test) %>%
gather(key, value, -id) %>%
arrange(id, key) %>%
.$value
Upvotes: 0
Reputation: 1111
Create an empty list() and assign it to a variable before the loop. Then, inside the loop, append each chunk you keep to that list.
new_vector <- list()
for (i in 1:4){
aa=chunk[[i]]
if (sum(aa)!=0){
a.no0=aa
new_vector <- append(new_vector, a.no0)
}
}
new_vector
This will return:
[[1]]
[1] 1
[[2]]
[1] 1
[[3]]
[1] 0
[[4]]
[1] 0
[[5]]
[1] 1
[[6]]
[1] 0
[[7]]
[1] 0
[[8]]
[1] 1
[[9]]
[1] 0
But I think you want a flattened vector:
as.vector(unlist(new_vector))
[1] 1 1 0 0 1 0 0 1 0
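If the flat vector is the goal, one option is to start from an empty vector instead of a list, so that append() concatenates the values directly and no unlist() step is needed. A minimal sketch, rebuilding chunk as in the question so that it runs on its own:

```r
set.seed(123)
sample1 <- rbinom(12, 1, 0.2)
chunk <- split(sample1, cut(seq_along(sample1), 4, labels = FALSE))

new_vector <- integer(0)  # empty integer vector, not a list
for (i in seq_along(chunk)) {
  aa <- chunk[[i]]
  if (sum(aa) != 0) {
    new_vector <- append(new_vector, aa)
  }
}
new_vector
```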
Upvotes: 0