CptNemo

Reputation: 6755

ggplot: plot geom_boxplot from list of elements of different length

I have a time series whose elements are grouped in a list, with one list element per time step t.

series <- 1:10

lst <- list()

set.seed(28100)
for (t in series) {
  lst[[t]] <- sample(c(1:20, NA), sample(1:20, 1))
}

The length of the list elements can vary, so the list clearly cannot be converted directly into a two-dimensional data.frame:

lst
# [[1]]
# [1]  6  7 12  4 15 20  3
# 
# [[2]]
# [1] 14 18  8 20 NA  6 19  4  9  5  1 13  3 10 12 15
# [17] 11 17
# 
# ...
# 
# [[9]]
# [1]  3  9 12  8 16 15 10 19 14 11  6  2 20 13  5 18
# 
# [[10]]
# [1]  4 20 10  2 12  5 19  1 NA 11 14  7 17

I still want to create a timeseries boxplot (such as this) with geom_boxplot() including outliers of my distributions.

Upvotes: 0

Views: 665

Answers (1)

icj

Reputation: 46

If you are trying to plot series on the x-axis and your sampled values (I'll call them y) on the y-axis, then create a list of data frames and stack them to get the data structure ggplot needs. For example:

library(ggplot2)

# Turn each list element into a data frame with one row per sampled value
dfs <- lapply(series, function(x) {
  data.frame(series = factor(x, levels = series),
             y = lst[[x]])
})

# Stack the data frames into a single long-format data frame
stacked <- do.call(rbind, dfs)

# Make the plot (rows where y is NA are dropped with a warning)
ggplot(stacked, aes(x = series, y = y)) +
  geom_boxplot()
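
The same long-format structure can also be built without an intermediate list of data frames. The following is a minimal sketch rather than the only way to do it: it rebuilds the example list from the question so that it runs on its own, and the name `long` is just an illustrative choice. It assumes R 3.2.0 or later for `lengths()`, and it passes `na.rm = TRUE` to `geom_boxplot()` so that the `NA` values are dropped without a warning.

```r
# Rebuild the example list from the question so this sketch is self-contained
series <- 1:10
set.seed(28100)
lst <- lapply(series, function(t) sample(c(1:20, NA), sample(1:20, 1)))

# One row per sampled value: repeat each series index once per value it holds
long <- data.frame(
  series = factor(rep(series, lengths(lst)), levels = series),
  y      = unlist(lst)
)

# Sanity check: the row count equals the total number of sampled values
stopifnot(nrow(long) == sum(lengths(lst)))

# Plot only if ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(long, aes(x = series, y = y)) +
    geom_boxplot(na.rm = TRUE)
  print(p)
}
```

Because `series` is a factor with all ten levels, each time step gets its own box even though the groups contain different numbers of values.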

Upvotes: 1
