mich106

Reputation: 1

for loop with if statement inside to create multiple graphs using ggstatsplot package

I am struggling with my code and I can't figure out what the actual problem is.

Just to give you a bit of context: I am trying to write code that performs automatic EDA using the ggstatsplot package. I would like to select a target variable in my dataset and then have the program loop over the remaining columns, performing a different bivariate analysis depending on the variable types: ggscatterstats if both are numerical, ggbetweenstats if one is a factor and the other is numerical, and ggbarstats if both are factors. I am attaching a short dataset I am using for experimentation.
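To make the intended dispatch concrete, here is a minimal base-R sketch of just the type check, without calling ggstatsplot at all. The helper `choose_plot_fn` and the toy columns `Age` and `Plan` are hypothetical and only there for illustration; `Upselling` is the target from my dataset, assumed here to be a factor.

```r
# Hypothetical helper (not part of ggstatsplot): returns the NAME of the
# plotting function that matches the types of the two variables.
choose_plot_fn <- function(x, y) {
  if (is.numeric(x) && is.numeric(y)) {
    "ggscatterstats"
  } else if ((is.numeric(x) && is.factor(y)) || (is.factor(x) && is.numeric(y))) {
    "ggbetweenstats"
  } else {
    "ggbarstats"
  }
}

# Toy data frame with made-up values; only the column types matter here.
toy <- data.frame(
  Age = c(25, 40, 31, 58),
  Plan = factor(c("basic", "premium", "basic", "premium")),
  Upselling = factor(c("no", "yes", "no", "yes"))
)

target_name <- "Upselling"
# Loop over column NAMES and leave the target itself out.
predictors <- setdiff(names(toy), target_name)

chosen <- vapply(
  predictors,
  function(nm) choose_plot_fn(toy[[nm]], toy[[target_name]]),
  character(1)
)
print(chosen)
```

With a factor target, the numeric column maps to ggbetweenstats and the factor column to ggbarstats, which is the behaviour I am after.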

[Image: what the dataset looks like]

The code I am using is the following (let's suppose our target is Upselling, so the code should only produce ggbetweenstats and ggbarstats plots):

library(ggstatsplot)

df <- dataset
target_var <- dataset$Upselling

for (var in 1:ncol(df)) {
  if (is.numeric(df[[var]]) && is.numeric(target_var)) {
    plots <- ggscatterstats(data = df, x = var, y = target_var)
  } else if (is.numeric(df[[var]]) && is.factor(target_var) || is.factor(df[[var]]) && is.numeric(target_var)) {
    plots <- ggbetweenstats(data = df, x = var, y = target_var)
  } else {
    plots <- ggbarstats(data = df, x = var, y = target_var)
  }
  print(plots)
}

The error I am getting is the following:

Error in select():
! Can't subset columns that don't exist.
✖ Column var doesn't exist.
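One possible reading of this message, which I have not checked against the ggstatsplot source, is that the `x` argument is captured as it is written rather than evaluated, so the function looks for a column literally called `var` instead of using the loop value. The base-R sketch below only illustrates that general mechanism; `arg_as_written` is a hypothetical function and has nothing to do with ggstatsplot itself.

```r
# Hypothetical function: captures its argument as written, without evaluating it.
arg_as_written <- function(x) {
  deparse(substitute(x))
}

var <- 3                      # loop index, as in my for loop
seen <- arg_as_written(var)   # the function sees the symbol, not the value 3
print(seen)
```

If that is what happens, the same would apply to `y = target_var`, which in my code is a vector (`dataset$Upselling`) rather than a column name.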

Could you please help? Thank you so much
