Reputation: 31
I repeated an experiment (rep1 and rep2). For each replicate I have two columns (a, sum) and two rows of the tested subjects that belong together (group AA, BB...). For analysis, I would like to randomly assign the collected data (a and sum) to rep1 and rep2. For this, I was trying to randomly select groups and swap "a" and "sum" of rep1 and rep2. I am trying to repeat the random swapping 100 times, creating 100 datasets for analysis.
I came across unique(df$group) to specify that the data of each group belongs together. Combined with sample(unique(df$group), 2), it randomly samples, say, 2 groups. But I don't know how to swap the replicate data of the selected groups.
Here is an example of the data:
group = c("A", "A", "B", "B", "C", "C")
rep1_a = c(2, 8, 5, 5, 4, 6)
rep1_sum = c(10, 10, 10, 10, 10, 10)
rep2_a = c(3, 8, 4, 5, 5, 6)
rep2_sum = c(11, 11, 9, 9, 11, 11)
df = data.frame(group, rep1_a, rep1_sum, rep2_a, rep2_sum)
#   group rep1_a rep1_sum rep2_a rep2_sum
# 1     A      2       10      3       11
# 2     A      8       10      8       11
# 3     B      5       10      4        9
# 4     B      5       10      5        9
# 5     C      4       10      5       11
# 6     C      6       10      6       11
And here is what it should look like if, out of these 3 groups, the replicates of group A are swapped:
  group rep1_a rep1_sum rep2_a rep2_sum
1     A      3       11      2       10
2     A      8       11      8       10
3     B      5       10      4        9
4     B      5       10      5        9
5     C      4       10      5       11
6     C      6       10      6       11
Upvotes: 3
Views: 1323
Reputation: 93803
A data.table version:
library(data.table)
setDT(df)
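# Draw one TRUE/FALSE per group: should this group's replicates be swapped?
df[, swap := sample(c(TRUE, FALSE), 1), by = group]
rbind(
  df[(!swap)],
  # For swapped groups, reorder the columns and restore the original names
  df[(swap), setNames(.(group, rep2_a, rep2_sum, rep1_a, rep1_sum, swap), names(df))]
)[order(group)]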
It just swaps the columns if the swap variable is TRUE; otherwise the set of rows in the group is returned unchanged.
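To get the 100 datasets the question asks for, the same per-group coin flip can be wrapped in a function and repeated. Below is a minimal sketch in base R (so it does not depend on data.table), assuming the column names from the question. swap_groups is a hypothetical helper name, and the seed is arbitrary.

```r
# Example data from the question
df <- data.frame(
  group    = c("A", "A", "B", "B", "C", "C"),
  rep1_a   = c(2, 8, 5, 5, 4, 6),
  rep1_sum = c(10, 10, 10, 10, 10, 10),
  rep2_a   = c(3, 8, 4, 5, 5, 6),
  rep2_sum = c(11, 11, 9, 9, 11, 11)
)

# swap_groups() is a hypothetical helper (not from any package):
# each group has its rep1 and rep2 columns swapped with probability 0.5.
swap_groups <- function(d) {
  groups <- unique(d$group)
  # One coin flip per group, so all rows of a group are treated alike
  flip <- sample(c(TRUE, FALSE), length(groups), replace = TRUE)
  rows <- d$group %in% groups[flip]
  rep1 <- c("rep1_a", "rep1_sum")
  rep2 <- c("rep2_a", "rep2_sum")
  if (any(rows)) {
    # Data frame assignment is positional, so this exchanges rep1 and rep2
    d[rows, c(rep1, rep2)] <- d[rows, c(rep2, rep1)]
  }
  d
}

set.seed(42)  # arbitrary seed, only for reproducibility
datasets <- replicate(100, swap_groups(df), simplify = FALSE)
```

datasets is then a list of 100 data frames, so an analysis can be applied to each one with lapply(datasets, ...).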
Upvotes: 0
Reputation: 287
Here's one way of doing it with dplyr. The following code builds a new data set 100 times, each time taking every group from either rep1 or rep2 with equal probability, and runs the desired analysis on each new data set.
library(dplyr)
exp_data <- tibble()         # data_frame() is deprecated; tibble() replaces it
analysis_result <- tibble()
for (i in 1:100) {
  # Your new 'experiment', made by randomly mixing the two real experiments; 'exp_id' labels the iteration
  new_df <- df %>%
    group_by(group) %>%
    mutate(x = runif(1)) %>%  # one random draw per group
    mutate(repr_a = ifelse(x > 0.5, rep1_a, rep2_a),
           repr_sum = ifelse(x > 0.5, rep1_sum, rep2_sum),
           exp_id = i) %>%
    ungroup() %>%
    select(exp_id, group, repr_a, repr_sum)
  # Your analysis; below is my example
  new_analysis <- new_df %>%
    group_by(exp_id, group) %>%
    summarise(outcome = mean(repr_a * repr_sum), .groups = "drop")
  exp_data <- bind_rows(exp_data, new_df)
  analysis_result <- bind_rows(analysis_result, new_analysis)
}
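The loop above keeps only one replicate per group. If both replicates should be kept and merely exchanged, as in the expected output in the question, a variant along these lines might work. It is a sketch under the question's column names; swap, tmp_a, tmp_sum and swapped_df are names introduced here for illustration, and the seed is arbitrary.

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  group    = c("A", "A", "B", "B", "C", "C"),
  rep1_a   = c(2, 8, 5, 5, 4, 6),
  rep1_sum = c(10, 10, 10, 10, 10, 10),
  rep2_a   = c(3, 8, 4, 5, 5, 6),
  rep2_sum = c(11, 11, 9, 9, 11, 11)
)

set.seed(1)  # arbitrary seed, only for reproducibility
swapped_df <- df %>%
  group_by(group) %>%
  mutate(swap = runif(1) > 0.5) %>%  # one draw per group
  ungroup() %>%
  mutate(
    # mutate() evaluates sequentially, so keep copies of rep1 first
    tmp_a    = rep1_a,
    tmp_sum  = rep1_sum,
    rep1_a   = ifelse(swap, rep2_a, rep1_a),
    rep1_sum = ifelse(swap, rep2_sum, rep1_sum),
    rep2_a   = ifelse(swap, tmp_a, rep2_a),
    rep2_sum = ifelse(swap, tmp_sum, rep2_sum)
  ) %>%
  select(-tmp_a, -tmp_sum)
```

The swap column is left in the result so it is easy to see which groups were exchanged; it can be dropped with select(-swap) before analysis.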
Upvotes: 1