Reputation: 675
I couldn't find an existing question that fully answers this one. In previous posts, the accepted answers use shadow_mark to keep previously rendered layers on screen.
How to keep previous layers of data while doing animation in R gganimate?
This is an okay workaround for a scatterplot, but shadow_mark only leaves earlier frames visible; it does not accumulate values, so it fails for something like a stacked bar graph.
Consider the following data. I want to build up a cumulative stacked bar graph, using the t column of my df as the transition state.
df <- data.frame(t = c(2000, 2000, 2001, 2001, 2002, 2002),
                 f = c("y", "n", "y", "n", "y", "n"),
                 x = c("a", "a", "b", "c", "a", "c"),
                 y = c(2, 3, 5, 1, 4, 8))
> df
     t f x y
1 2000 y a 2
2 2000 n a 3
3 2001 y b 5
4 2001 n c 1
5 2002 y a 4
6 2002 n c 8
I want to display the data from 2000, and in the next layer I want to add the data from 2001 as cumulative with the previous layer. And again, for the next layer, I want to add the data from 2002 as cumulative with 2000 and 2001.
This shows why shadow_mark is not a solution to cumulative data:
ggplot(df, aes(x=x, y=y, fill=f)) +
  geom_col() + labs(x=NULL, y=NULL, fill=NULL, title="{closest_state}") +
  transition_states(t, transition_length = 2, state_length = 1) +
  shadow_mark() + enter_fade() + exit_shrink() + ease_aes('sine-in-out') + theme_bw()
Adding a call to shadow_mark will not achieve the desired results of a cumulative plot. "a" should have a cumulative total of 9.
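As a quick check of the target heights, the final cumulative totals can be computed directly from df. This is only a base-R sketch for verifying the arithmetic, not part of the animation; the name totals is introduced here for illustration.

```r
df <- data.frame(t = c(2000, 2000, 2001, 2001, 2002, 2002),
                 f = c("y", "n", "y", "n", "y", "n"),
                 x = c("a", "a", "b", "c", "a", "c"),
                 y = c(2, 3, 5, 1, 4, 8))

# Final stacked height of each bar once every year has been added
totals <- aggregate(y ~ x, data = df, FUN = sum)
totals
#   x y
# 1 a 9
# 2 b 5
# 3 c 9
```

For "a" this is 2 + 3 from 2000 plus 4 from 2002, which gives the 9 mentioned above.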
It could be possible to subset the data into 3 different df's for c(2000), c(2000,2001), and c(2000,2001,2002), and then rbind after creating a new states column, but that seems like a very hacky approach.
Is there a cleaner way to display cumulative data with the tools built into gganimate?
Upvotes: 4
Views: 1887
Reputation: 3806
I found for histograms the trick was to duplicate earlier transition values to ensure a cumulative build.
# packages my mac needs for gganimate to work
if (!require("pacman")) install.packages("pacman")
pacman::p_load(dplyr, gganimate, gifski, png)

# vector of values to plot in histogram
sampling_dist_v1 <- rnorm(1e3)

# create a transition sequence variable
init_seq <- c(1, rep(2,10), rep(3,10))
observed_rates <-
  tibble(
    observed_rate = sampling_dist_v1,
    transition_sequence = c(init_seq, rep(4, length(sampling_dist_v1) - length(init_seq)))
  )

# duplicate earlier entries to ensure animation is cumulative
t_sub_4 <-
  observed_rates %>%
  filter(transition_sequence < 4) %>%
  mutate(transition_sequence = 4)
t_sub_3 <-
  observed_rates %>%
  filter(transition_sequence < 3) %>%
  mutate(transition_sequence = 3)
t_sub_2 <-
  observed_rates %>%
  filter(transition_sequence < 2) %>%
  mutate(transition_sequence = 2)
observed_rates <-
  bind_rows(
    observed_rates,
    t_sub_4,
    t_sub_3,
    t_sub_2
  )

# animate
anim <- observed_rates %>%
  ggplot(aes(x = observed_rate)) +
  geom_histogram(binwidth = .25, fill = 'blue') +
  transition_states(
    transition_sequence,
    state_length = 4,
    wrap = FALSE
  )
[![enter image description here][1]][1]
[1]: https://i.sstatic.net/vyJ8m.gif
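The same duplication idea could be applied to the df from the question without writing one subset per state. The following is a base-R sketch only; the names years, df_cum and state are introduced here for illustration, and how gganimate tweens the duplicated rows in a stacked geom_col has not been checked.

```r
df <- data.frame(t = c(2000, 2000, 2001, 2001, 2002, 2002),
                 f = c("y", "n", "y", "n", "y", "n"),
                 x = c("a", "a", "b", "c", "a", "c"),
                 y = c(2, 3, 5, 1, 4, 8))

years <- sort(unique(df$t))

# For each state, keep every row observed up to and including that year,
# and record the state in a new column
df_cum <- do.call(rbind, lapply(years, function(yr) {
  rows <- df[df$t <= yr, ]
  rows$state <- yr
  rows
}))

# Bar height per x within each state, to check the build-up
heights <- aggregate(y ~ x + state, data = df_cum, FUN = sum)
heights
```

The state column, rather than t, would then be the variable passed to transition_states.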
Upvotes: 0
Reputation: 93851
You could create a new column in the data holding the running total for each year and plot that directly. In the code below, we do this with the cumsum function. We also use complete to ensure that every combination of f and x has a row for every value of t (setting y=0 in these added rows). If we don't do this, a combination with no row for a given year has no running total in that state, so the plot will be wrong whenever some years (t values) are missing for some combinations of f and x. All of the data transformations are done on the fly with a dplyr pipe:
library(tidyverse)
library(gganimate)

ggplot(df %>%
         complete(t, nesting(f, x), fill=list(y=0)) %>%
         arrange(t) %>%
         group_by(x,f) %>%
         mutate(y_cum = cumsum(y)),
       aes(x=x, y=y_cum, fill=f)) +
  geom_col() +
  labs(x=NULL, y=NULL, fill=NULL, title="{closest_state}") +
  transition_states(t, transition_length = 2, state_length = 1) +
  enter_fade() + ease_aes('sine-in-out') +
  theme_bw() +
  scale_y_continuous(breaks=0:10)
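To see what the pipe hands to ggplot, the transformed data can be built on its own and inspected. This is a sketch that repeats the question's df and the same complete/cumsum steps; the names df_cum and final are introduced here for illustration.

```r
library(dplyr)
library(tidyr)

df <- data.frame(t = c(2000, 2000, 2001, 2001, 2002, 2002),
                 f = c("y", "n", "y", "n", "y", "n"),
                 x = c("a", "a", "b", "c", "a", "c"),
                 y = c(2, 3, 5, 1, 4, 8))

# Same transformation as in the plot call, kept as a separate object
df_cum <- df %>%
  complete(t, nesting(f, x), fill = list(y = 0)) %>%
  arrange(t) %>%
  group_by(x, f) %>%
  mutate(y_cum = cumsum(y)) %>%
  ungroup()

# Total bar height per x in the last state
final <- df_cum %>%
  filter(t == 2002) %>%
  group_by(x) %>%
  summarise(total = sum(y_cum))
final
```

There are four (f, x) combinations in the data, so df_cum has one row per combination per year, and "a" reaches a total of 9 in 2002.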
Upvotes: 3