Reputation: 1411
I am familiar with some of the split-apply-combine functions in R, like ddply, but I am unsure how to split a data frame, modify a single variable within each subset, and then recombine the subsets. I can do this manually, but there is surely a better way.
In my example, I am trying to shuffle a single variable (but none of the others) within each group. This is for a permutation analysis, so I repeat the shuffle many times and would like to speed it up.
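library(plyr)  # provides rbind.fill
# Split by cp, shuffle party within each subset, then recombine
allS <- split(all, f=all$cp)
for(j in seq_along(allS)){
  allS[[j]]$party <- sample(x=allS[[j]]$party)
}
tmpAll <- rbind.fill(allS)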
Sample data frame:
all <- data.frame(cp=factor(1:5), party=rep(c("A","B","C","D"), 5))
Thanks for any direction!
Upvotes: 3
Views: 1439
Reputation: 3710
The dplyr way.
library(dplyr)
all %>% group_by(cp) %>% mutate(party=sample(party))
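One way to check that the shuffle stays within groups is to compare the per-group counts of party before and after; they should match. This is a minimal sketch, assuming the sample data from the question. The name `dat` (used here to avoid masking `base::all`), the seed, and the trailing `ungroup()` are my own choices, not part of the answer above.

```r
library(dplyr)

# Sample data from the question; `dat` is an illustrative name.
dat <- data.frame(cp = factor(1:5), party = rep(c("A", "B", "C", "D"), 5))

set.seed(1)  # arbitrary seed, only for reproducibility

shuffled <- dat %>%
  group_by(cp) %>%
  mutate(party = sample(party)) %>%  # permute party within each cp group
  ungroup()                          # drop the grouping afterwards

# Compare the cp-by-party contingency tables before and after the shuffle.
same_composition <- identical(table(dat$cp, dat$party),
                              table(shuffled$cp, shuffled$party))
same_composition
```

Only the party column is permuted; cp and the row order of the data frame are left as they were.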
Upvotes: 2
Reputation: 887118
We can use data.table. We convert the 'data.frame' to a 'data.table' (setDT(all)), group by 'cp', sample the 'party' column, and assign (:=) that output back to the 'party' column.
library(data.table)
setDT(all)[, party:= sample(party) , by = cp]
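Because := updates the table by reference, the same call can be repeated inside a loop for a permutation analysis without copying the data each time. The following is a rough sketch under assumptions of mine: the name `dat`, the extra column `party_perm`, the number of permutations `n_perm`, and the placeholder statistic are all hypothetical and stand in for whatever the real analysis computes.

```r
library(data.table)

# Same rows as the question's sample data, with the recycling of cp written out.
dat <- data.table(cp = factor(rep(1:5, 4)),
                  party = rep(c("A", "B", "C", "D"), 5))

set.seed(42)   # arbitrary seed, only for reproducibility
n_perm <- 200  # hypothetical number of permutations

perm_stat <- replicate(n_perm, {
  # Reshuffle party within each cp group, writing into a separate column
  # so the observed party values stay available for comparison.
  dat[, party_perm := sample(party), by = cp]
  # Placeholder statistic: rows whose shuffled value equals the observed one.
  dat[, sum(party_perm == party)]
})

length(perm_stat)
```

Writing the shuffle into `party_perm` rather than overwriting `party` is a design choice for this sketch; overwriting in place, as in the answer, also yields a valid within-group permutation on every iteration.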
Upvotes: 4