mas2

Reputation: 65

Multiple boxplots side by side

I'm trying to make multiple boxplots side by side with ggplot2. I've been following the steps in "Multiple boxplots placed side by side for different column values in ggplot", but without much luck.

I have the following objects:

Raw <- sp500_logreturns
Normal <- rnorm(1000, 0, sd(sp500_logreturns))
Student <- cbind(c(rt(1000, df = 2)),c(rt(1000, df = 3)))

And I want to make a boxplot like the following sketch: Boxplot sketch

My Raw vector contains the log returns of my prices, which were downloaded from Yahoo into R as an environment. I must admit I'm quite lost, and I don't know whether I'm on an impossible mission. I hope I've described my problem well enough together with my sketch. Thank you in advance.

Update 1: The goal is to compare the distribution of the raw data with the candidate distributions. The raw data is leptokurtic, so a Student's t distribution with 2 or 3 degrees of freedom might be more suitable than a normal distribution. To give you an idea of the data I'm looking at, here's a summary:

      Min.    1st Qu.     Median       Mean    3rd Qu.       Max. 
-0.0418425 -0.0023740  0.0005898  0.0004704  0.0045065  0.0484032  
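
The leptokurtosis claim can be checked roughly with the sample excess kurtosis in base R. This is only a minimal sketch: `excess_kurtosis` is a hypothetical helper (not from any package), and simulated draws stand in for `sp500_logreturns`, which is not reproduced here.

```r
# Hypothetical helper: moment-based sample excess kurtosis
# (approximately 0 for normally distributed data)
excess_kurtosis <- function(x) {
  x <- as.numeric(x)
  m2 <- mean((x - mean(x))^2)  # second central moment
  m4 <- mean((x - mean(x))^4)  # fourth central moment
  m4 / m2^2 - 3
}

set.seed(1)
normal_draws <- rnorm(1000)
heavy_draws  <- rt(1000, df = 5)  # assumption: stand-in for heavy-tailed returns

excess_kurtosis(normal_draws)  # should be near 0
excess_kurtosis(heavy_draws)   # heavy tails push this above 0
```

A clearly positive value for the real log returns would support comparing them against a Student's t distribution rather than a normal one.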

Here is my boxplot made from Edward's code: Boxplot (Edward)

Update 2: I figured it out. I used fitdist from rugarch to fit a Student's t distribution to the raw data. This way I no longer have to try out different degrees of freedom by hand. This is what I will go on with:

library(rugarch)  # provides fitdist() and rdist()

fitdist(distribution = 'std', sp500_logreturns)$pars
          mu        sigma        shape 
0.0008121004 0.0113748869 2.3848231857 

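library(dplyr)    # %>%, mutate(), summarise(), filter()
library(tidyr)    # pivot_longer()
library(ggplot2)

data <- data.frame(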
        Raw = as.numeric(sp500_logreturns),
        Normal = rnorm(1006, 0, sd(sp500_logreturns)),
        Student = rdist(distribution = 'std', n = 1006, mu = 0.0008121004, sigma = 0.0113748869, shape = 2.3848231857)
)

data2 <- pivot_longer(data, cols=everything()) %>%
        mutate(name=factor(name, levels=c("Raw","Normal","Student")))

data3 <- data2 %>% summarise(min=min(value), max=max(value))

pbox1 <- (filter(data2, name %in% c("Raw","Normal","Student")) %>%
        ggplot(aes(y=value, fill=name)) +
        geom_boxplot() +
        facet_grid(~name) +
        ylab("Log-returns") +
        ylim(data3$min, data3$max) +
        theme(legend.position = "none",
              axis.ticks.x=element_blank(),
              panel.grid.major.x = element_blank(),
              panel.grid.minor.x = element_blank(),
              axis.text.x=element_text(color="white"))+
        ggtitle("Boxplot comparison")+
        theme(plot.title = element_text(hjust = 0.5)))

And this gives me: Boxplot (final)

Upvotes: 0

Views: 2430

Answers (1)

Edward

Reputation: 18598

In base R:

set.seed(11)
data <- data.frame(
  Raw = rnorm(1000),
  Normal = rnorm(1000),
  Student = cbind(c(rt(1000, df = 2)),c(rt(1000, df = 3)))
)

ylim=c(min(data), max(data))

layout(matrix(1:3, ncol=3), widths=c(5,4,5))
par(las=1, mar=c(2,4,5,0))
boxplot(data$Raw, col="steelblue", ylab="Log-returns", ylim=ylim)
title(main="Raw", line=1)

par(mar=c(2,1,5,0))
boxplot(data$Normal, yaxt="n", col="tomato", ylim=ylim)
title(main="Normal", line=1)

par(mar=c(2,1,5,1))
boxplot(data[,3:4], yaxt="n", col=c("green1","green3"), names=c("df = 2","df = 3"), ylim=ylim)
title(main="Student", line=1)
title(main="Boxplot comparison", outer=TRUE, line=-1.5, cex.main=1.5)



In ggplot2, more work is involved:

set.seed(11)
data <- data.frame(
  Raw = rnorm(1000),
  Normal = rnorm(1000),
  Student = cbind(c(rt(1000, df = 2)),c(rt(1000, df = 3)))
)

library(dplyr)
library(tidyr)
library(ggplot2)

data2 <- pivot_longer(data, cols=everything()) %>%
  mutate(name=factor(name, levels=c("Raw","Normal","Student.1","Student.2")))

data3 <- data2 %>% summarise(min=min(value), max=max(value))

p1 <- filter(data2, name %in% c("Raw","Normal")) %>%
  ggplot(aes(y=value, fill=name)) +
  geom_boxplot() +
  facet_grid(~name) +
  ylab("Log-returns") +
  ylim(data3$min, data3$max) +
  theme_bw() +
  theme(legend.position = "none",
        axis.ticks.x=element_blank(),
        panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank(),
        axis.text.x=element_text(color="white"))

p2 <- filter(data2, grepl("Student", name)) %>%
  mutate(what="Student") %>%
  ggplot(aes(x=name, y=value, fill=name)) +
  geom_boxplot() +
  scale_fill_manual(values=c("green1","green3")) +
  scale_x_discrete(labels=c("df=2", "df=3")) +
  facet_grid(~what) +
  ylim(data3$min, data3$max) +
  theme_bw() +
  theme(legend.position = "none",
        axis.title.y = element_blank(),
        axis.title.x=element_blank(),
        axis.text.y=element_blank(),
        axis.ticks.y=element_blank())

library(ggpubr)
ggarrange(p1, p2)

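
A possible alternative that avoids ggpubr is to draw everything in one ggplot and let `facet_grid()` size the panels with `scales = "free_x"` and `space = "free_x"`. This is a sketch of a different technique, not the approach above: the column names `t_df2` and `t_df3` and the `panel` variable are assumptions made for the example.

```r
library(tidyr)
library(ggplot2)

set.seed(11)
# Same simulated data as above, but with explicit (assumed) names for the t samples
data <- data.frame(
  Raw    = rnorm(1000),
  Normal = rnorm(1000),
  t_df2  = rt(1000, df = 2),
  t_df3  = rt(1000, df = 3)
)

long <- pivot_longer(data, cols = everything())

# Both t samples share a single "Student" panel
long$panel <- factor(
  ifelse(grepl("^t_", long$name), "Student", long$name),
  levels = c("Raw", "Normal", "Student")
)
long$name <- factor(long$name, levels = c("Raw", "Normal", "t_df2", "t_df3"))

p <- ggplot(long, aes(x = name, y = value, fill = name)) +
  geom_boxplot() +
  # free_x drops unused categories per panel; space = "free_x" makes the
  # Student panel twice as wide because it holds two boxes
  facet_grid(~panel, scales = "free_x", space = "free_x") +
  scale_x_discrete(labels = c(Raw = "", Normal = "",
                              t_df2 = "df = 2", t_df3 = "df = 3")) +
  ylab("Log-returns") +
  theme_bw() +
  theme(legend.position = "none", axis.title.x = element_blank())

print(p)
```

Because all panels belong to one plot, the y axis is shared automatically, so the separate `ylim()` calls are not needed in this variant.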

Upvotes: 2
