André_1090

Reputation: 63

R: Plot boxplots from different dataframes in one plot

I'm analyzing some data and am stuck on the visualization; I can't make any progress right now.

So, here are dummy dataframes which are similar to the ones I use:

df1<-data.frame(replicate(15,sample(0:200,1500,rep=TRUE)))
df2<-data.frame(replicate(15,sample(0:200,1500,rep=TRUE)))
df3<-data.frame(replicate(36,sample(0:200,1500,rep=TRUE)))
df4<-data.frame(replicate(9,sample(0:200,1500,rep=TRUE)))

The problem is the following:

I want to plot a boxplot of each dataframe as a whole, side by side, so that the boxplots for df1, df2, df3 and df4 sit next to each other in one plot. I don't want one box per station (column), but one box per dataframe as a whole.

Plotting the boxplots for each dataframe separately works fine:

boxplot(df1, las=2)
boxplot(df2, las=2)
boxplot(df3, las=2)
boxplot(df4, las=2)

I then tried to combine them with ggplot:

ggplot(data = NULL, aes(x, y))+
  geom_boxplot(data = df1, aes())+
  geom_boxplot(data = df2, aes())+
  geom_boxplot(data = df3, aes())+
  geom_boxplot(data = df4, aes())

But here I get an error message

Error in FUN(X[[i]], ...) : object 'x' not found

telling me that something is wrong with the aes(), which is obvious, but I have no idea what to choose for x and y. Maybe I'm overcomplicating this, but there is some link I'm missing.

I hope everything is understandable. If any information is missing, just ask and I'll add it!

Upvotes: 2

Views: 2189

Answers (1)

stefan

Reputation: 123978

Maybe this is what you are looking for. To replicate the base R boxplots via ggplot2 you could

  1. Put your dfs in a list
  2. Convert the dfs to long format, for which I use lapply and a helper function which
    • converts the df to long format using tidyr::pivot_longer
    • uses forcats::fct_inorder to convert the column holding the variable names to a factor, which preserves the column order of the original df.
  3. Bind the long dfs into one dataframe using e.g. dplyr::bind_rows, where I add an id variable that identifies the source df
  4. After the data wrangling it's an easy task to make the boxplots via ggplot2, whereby I opted for faceting by df.

library(ggplot2)
library(tidyr)
library(dplyr)

df1<-data.frame(replicate(15,sample(0:200,1500,rep=TRUE)))
df2<-data.frame(replicate(15,sample(0:200,1500,rep=TRUE)))
df3<-data.frame(replicate(36,sample(0:200,1500,rep=TRUE)))
df4<-data.frame(replicate(9,sample(0:200,1500,rep=TRUE)))

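# 1. Put the dataframes in a list
df <- list(df1, df2, df3, df4)

# 2. Helper: reshape to long format (columns "name" and "value"),
#    keeping the original column order as factor levels
to_long <- function(x) {
  pivot_longer(x, everything()) %>% 
    mutate(name = forcats::fct_inorder(name))
}
df <- lapply(df, to_long)
# 3. Stack the long dataframes; "id" holds the list position ("1" to "4")
df <- bind_rows(df, .id = "id")

# 4. One box per column, one facet per original dataframe
ggplot(df, aes(name, value)) +
  geom_boxplot() +
  facet_wrap(~id, scales = "free_x")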

EDIT: To get one boxplot per dataframe, pooling all of its columns, with the boxplots side by side, you can do

ggplot(df, aes(id, value)) +
  geom_boxplot()
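
For comparison, here is a minimal base R sketch of the same "one box per dataframe" idea. It is not part of the approach above. It assumes it is acceptable to pool every column of a dataframe into a single vector with unlist. The name `pooled` and the fixed seed are illustrative choices, added only so the sketch is self-contained and reproducible.

```r
# Dummy data in the same shape as in the question
set.seed(1)  # assumption: seed added only for reproducibility
df1 <- data.frame(replicate(15, sample(0:200, 1500, replace = TRUE)))
df2 <- data.frame(replicate(15, sample(0:200, 1500, replace = TRUE)))
df3 <- data.frame(replicate(36, sample(0:200, 1500, replace = TRUE)))
df4 <- data.frame(replicate(9, sample(0:200, 1500, replace = TRUE)))

# Pool all columns of each dataframe into one plain vector;
# the list names become the labels on the x axis
pooled <- lapply(
  list(df1 = df1, df2 = df2, df3 = df3, df4 = df4),
  unlist, use.names = FALSE
)

# boxplot() accepts a named list and draws one box per element
boxplot(pooled, las = 2, ylab = "value")
```

Because each list element is just a vector, the dataframes may have different numbers of columns, as df3 and df4 do here.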

Upvotes: 3
