Reputation: 49
I have a sample dataframe that is 3600 rows long by 6 columns wide. I want to create a plot in R that shows six boxplots, one for each of the 6 columns of data. I am using ggplot. I can create them in Excel easily enough (shown below), but I want to be able to do it in R because my future dataframes are going to be much larger and R seems to handle large datasets much more easily.
Using the code below I can plot the first column fine, but can't figure out how to add the data from the other 5 columns.
ggplot(data=df)+
geom_boxplot(aes(x="Label", y=col1))
Upvotes: 0
Views: 1641
Reputation: 16178
Using geom_boxplot from ggplot2
To get a boxplot for each of your 6 columns with ggplot2, you first need to reshape your dataframe into a longer format to match the grammar of ggplot2 (one column for x values, one column for y values, and one or more columns for categorical values). Then you can use ggplot2 and its geom_boxplot function.
Here is an example using the built-in iris dataset:
> head(iris)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5.0 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
Using the pivot_longer function from the tidyr package, you can reshape the first 4 columns of this dataset into a longer format:
library(tidyr)
library(dplyr)
iris2 <- iris %>%
  pivot_longer(cols = Sepal.Length:Petal.Width, names_to = "Var", values_to = "val")
# Print the reshaped data to check the result
iris2
# A tibble: 600 x 3
Species Var val
<fct> <chr> <dbl>
1 setosa Sepal.Length 5.1
2 setosa Sepal.Width 3.5
3 setosa Petal.Length 1.4
4 setosa Petal.Width 0.2
5 setosa Sepal.Length 4.9
6 setosa Sepal.Width 3
7 setosa Petal.Length 1.4
8 setosa Petal.Width 0.2
9 setosa Sepal.Length 4.7
10 setosa Sepal.Width 3.2
# … with 590 more rows
Then you can use this new dataset in ggplot2 to get a boxplot for each value of Var:
library(ggplot2)
ggplot(iris2, aes(x = Var, y = val, fill = Var))+
geom_boxplot()
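The same steps should carry over to a six-column dataframe like yours. Below is a minimal sketch: the dataframe `df` and its column names `col1` to `col6` are stand-ins filled with random numbers (I don't have your data), and `cols = everything()` assumes that all six columns are numeric and should all be plotted.

```r
library(tidyr)
library(ggplot2)

# Hypothetical stand-in for your data: 3600 rows, 6 numeric columns.
set.seed(1)
df <- as.data.frame(matrix(rnorm(3600 * 6), ncol = 6))
names(df) <- paste0("col", 1:6)

# Reshape every column into name/value pairs (one row per original cell).
df_long <- pivot_longer(df, cols = everything(),
                        names_to = "Var", values_to = "val")

# One box per original column.
p <- ggplot(df_long, aes(x = Var, y = val, fill = Var)) +
  geom_boxplot()
print(p)
```

If your real dataframe also contains non-numeric columns (IDs, labels), list only the numeric ones in `cols` instead of using `everything()`.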
Alternative using base R
Without reshaping your dataframe, you can get the boxplots right away by using the boxplot function in base R:
boxplot(iris[,c(1:4)], col = c("red","green","blue","orange"))
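For a dataframe where every column is numeric, as yours seems to be, the base R call should work on the whole dataframe without selecting columns. The sketch below again uses a hypothetical `df` with made-up column names and random values. It also shows `stack()`, a base R way to get the long format if you prefer not to load tidyr.

```r
# Hypothetical stand-in for your data: 3600 rows, 6 numeric columns.
set.seed(1)
df <- as.data.frame(matrix(rnorm(3600 * 6), ncol = 6))
names(df) <- paste0("col", 1:6)

# boxplot() accepts a dataframe and draws one box per column.
boxplot(df, col = rainbow(6))

# stack() reshapes to long format: column "values" holds the numbers,
# column "ind" is a factor holding the original column names.
df_long <- stack(df)
boxplot(values ~ ind, data = df_long, col = rainbow(6))
```

Both calls should draw the same six boxes; the formula version is mainly useful if your data is already in long format.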
Does this answer your question?
Upvotes: 1