Reputation: 125
I have a dataset with a binary dependent variable and a number of predictors, including participant. I am trying to examine the idiosyncratic effects of different predictors for different participants. In order to do that, I'm trying to look at the effect of interactions between participant id and the other predictors on the dependent variable. I'm using randomForest in R. I can fit the forest successfully, and can produce partial dependence plots for individual variables. What I need, however, are partial dependence plots for pairs of variables - participant + others. Is this possible?
For reference, my code:
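library(randomForest)
# Draw a random sample of 500 rows without replacement
data_sample <- data_raw[sample(nrow(data_raw), 500, replace = FALSE), ]
test_rf <- randomForest(perceptually.rhotic ~ vowel + speaker + modified_clip_start + function_word + year_of_birth + gender + fathers_job_type + prepausal, data = data_sample, ntree = 500, mtry = 3)
# Works: partial dependence plot for a single variable
partialPlot(test_rf, pred.data = data_sample, x.var = "speaker")
# ??? What I would like: a two-variable version, something like
# partialPlot(test_rf, pred.data = data_sample, x.var = c("speaker", "vowel"))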
Thanks very much in advance for any advice anyone can offer!
Upvotes: 0
Views: 1480
Reputation: 1016
The plotmo R package will plot partial dependencies for all variables and pairs of variables (bivariate dependencies) for "any" model. For example:
library(randomForest)
data(trees)
mod <- randomForest(Volume~., data=trees)
library(plotmo)
plotmo(mod, pmethod="partdep") # plot partial dependencies
which gives a set of partial dependence plots (image not shown).
You can specify exactly which variables and variable pairs get plotted using plotmo's all1, all2, degree1, and degree2 arguments. Additional examples are in the vignette for the plotmo package.
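As a rough sketch of how those arguments can narrow the output to a single pair, the call below reuses the trees model from above and asks for only the Girth:Height plot. It assumes, per my reading of the plotmo documentation, that degree1 = FALSE suppresses the single-variable plots and that degree2 accepts a character vector naming the two variables. Check ?plotmo for your installed version before relying on it.

```r
library(randomForest)
library(plotmo)

data(trees)
set.seed(1)  # only so the forest is reproducible
mod <- randomForest(Volume ~ ., data = trees)

# Assumption: degree1 = FALSE drops the one-variable plots, and
# degree2 = c("Girth", "Height") selects just that pair.
plotmo(mod, pmethod = "partdep",
       degree1 = FALSE,
       degree2 = c("Girth", "Height"))
```

For the model in the question, the same pattern would presumably be degree2 = c("speaker", "vowel"), although I have not tried it on that data.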
Upvotes: 4