Reputation: 1271
I am building an animation of stacked bar charts (calling geom_col). I have 100 columns. When I generate the animation, I get a lot of white space in what should be filled columns.
See the gif below:
That gif is based on about 100k rows of data, so I can't post it all here. Notably, I can't reproduce this in a simpler example:
library('tidyverse')
library('gganimate')
data.frame(time = rep(1:50, 200)) %>%
  arrange(time) %>%
  mutate(type = rep(c(rep('A', 100), rep('B', 100)), 50),
         class = rep((1:100), 100),
         value = runif(10000, 0, 1)) %>%
  ggplot(aes(x = class, y = value, fill = type)) +
  geom_col() +
  transition_time(time)
This works fine (ignoring the structure in the above data); I don't get the white spaces:
I tried adding ease_aes(), enter_fade(), and exit_fade(), but none of that worked. Does anyone have thoughts on what is causing this?
---UPDATE---
Following the comments, I tried filtering the data down to see what was going on. After reducing it to just two countries and a few years of data, the problem appears to be that chunks of data are moving between percentiles, when what I want is for them to just grow and shrink within each percentile. You can see it in the gif below:
The data that produced this is here:
structure(list(country = c("US", "DE", "US", "US", "US", "DE",
"US", "DE", "US", "DE", "US", "DE", "US", "DE", "DE", "US", "DE",
"DE", "DE", "US", "DE", "US", "US", "US", "DE", "US", "DE", "US",
"US", "DE", "DE", "US", "DE", "US", "DE", "DE", "US", "DE", "US",
"DE", "US", "US", "DE", "US", "US", "DE", "US", "DE", "US", "DE",
"US", "DE", "DE", "US", "DE", "DE", "US", "US", "DE", "US", "US",
"DE", "US", "US", "DE", "US", "DE", "US", "DE", "US", "DE", "US",
"DE", "US", "DE", "DE", "US", "DE", "US", "US", "US", "DE", "US",
"DE", "US", "US", "DE", "US", "DE", "US", "DE", "US", "DE", "US",
"DE", "US", "US", "DE", "US", "US", "DE", "US", "US", "DE", "US",
"DE", "US", "DE", "US", "DE"), glob.perc = c(0, 1, 1, 2, 3, 3,
4, 4, 5, 5, 6, 6, 7, 7, 7, 8, 8, 9, 9, 0, 1, 1, 2, 3, 3, 4, 4,
5, 6, 6, 6, 7, 7, 8, 8, 8, 9, 9, 0, 1, 1, 2, 2, 3, 4, 4, 5, 5,
6, 6, 7, 7, 7, 8, 8, 9, 9, 0, 1, 1, 2, 2, 3, 4, 4, 5, 5, 6, 6,
7, 7, 8, 8, 9, 9, 9, 0, 1, 1, 2, 3, 3, 4, 4, 5, 6, 6, 7, 7, 8,
8, 9, 9, 0, 1, 1, 2, 2, 3, 4, 4, 5, 6, 6, 7, 7, 8, 8, 9, 9),
avg.income.country = c(437288.3, 95483.3754884956, 140784.030084749,
140733.5, 92860.7570361667, 27041.1685330627, 82474.4007614941,
22845.1776491941, 75584.1480877374, 20954.7760014288, 70400.3370710519,
19852.2326809271, 54038.6152996391, 15598.3057384556, 15170.9872445152,
62785.1002246113, 18201.6743099168, 39606.7790727414, 39051.1193095399,
450574.9, 89747.1381942579, 143040.424101143, 144413.3, 95281.4131057479,
26564.8030858664, 84645.1806598295, 22453.3134663253, 99495.4,
58448.7245539485, 16815.8081430027, 15925.4607078112, 67342.4870614877,
18775.7716260376, 52078.6261482834, 14908.4732454128, 14586.6597398625,
60740.8587598986, 17551.4029073371, 449672.7, 85860.9513060095,
138573.062299181, 107999.713224424, 26551.7207203881, 118606.7,
81673.5478130351, 22256.5124499113, 74664.7815210055, 20289.8692320157,
69424.4509484861, 19130.6427260963, 53441.6796042233, 15011.8413898757,
14554.8379632521, 62031.6543795656, 17372.7239256402, 17038.0153770701,
59253.6721580242, 478696.8, 87965.3040019279, 141489.41469306,
110750.734809188, 28139.4736007857, 121395.4, 84564.2106500617,
23136.9326230234, 77452.4071740221, 20809.5254887263, 72187.8010950261,
19423.2184457137, 67965.6133547784, 18489.4603327709, 64700.6833849069,
17811.5804850837, 50612.3590346861, 14165.4003733601, 13829.472811758,
542123.2, 89948.9091254987, 158338.248242006, 156908.9, 104475.681782063,
29031.666816329, 92305.5514014955, 23750.4970524401, 107775.8,
78090.1791649968, 21282.8059573008, 73283.2631907787, 19808.7465702618,
69304.0213872794, 18813.7418777938, 65958.7178466761, 18090.1791160505,
559720.3, 92129.3365959901, 159846.146463587, 123870.105638014,
30030.7222753586, 135301.9, 94785.176213572, 24358.2621716462,
110644.4, 80286.8697338142, 21690.4391200441, 75280.156096728,
20090.0002975319, 71136.641950609, 19006.2143886443, 67594.6662796918,
18216.0069568407), region = c("Americas", "Europe", "Americas",
"Americas", "Americas", "Europe", "Americas", "Europe", "Americas",
"Europe", "Americas", "Europe", "Americas", "Europe", "Europe",
"Americas", "Europe", "Europe", "Europe", "Americas", "Europe",
"Americas", "Americas", "Americas", "Europe", "Americas",
"Europe", "Americas", "Americas", "Europe", "Europe", "Americas",
"Europe", "Americas", "Europe", "Europe", "Americas", "Europe",
"Americas", "Europe", "Americas", "Americas", "Europe", "Americas",
"Americas", "Europe", "Americas", "Europe", "Americas", "Europe",
"Americas", "Europe", "Europe", "Americas", "Europe", "Europe",
"Americas", "Americas", "Europe", "Americas", "Americas",
"Europe", "Americas", "Americas", "Europe", "Americas", "Europe",
"Americas", "Europe", "Americas", "Europe", "Americas", "Europe",
"Americas", "Europe", "Europe", "Americas", "Europe", "Americas",
"Americas", "Americas", "Europe", "Americas", "Europe", "Americas",
"Americas", "Europe", "Americas", "Europe", "Americas", "Europe",
"Americas", "Europe", "Americas", "Europe", "Americas", "Americas",
"Europe", "Americas", "Americas", "Europe", "Americas", "Americas",
"Europe", "Americas", "Europe", "Americas", "Europe", "Americas",
"Europe"), year = c(1980L, 1980L, 1980L, 1980L, 1980L, 1980L,
1980L, 1980L, 1980L, 1980L, 1980L, 1980L, 1980L, 1980L, 1980L,
1980L, 1980L, 1980L, 1980L, 1981L, 1981L, 1981L, 1981L, 1981L,
1981L, 1981L, 1981L, 1981L, 1981L, 1981L, 1981L, 1981L, 1981L,
1981L, 1981L, 1981L, 1981L, 1981L, 1982L, 1982L, 1982L, 1982L,
1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 1982L,
1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 1983L, 1983L, 1983L,
1983L, 1983L, 1983L, 1983L, 1983L, 1983L, 1983L, 1983L, 1983L,
1983L, 1983L, 1983L, 1983L, 1983L, 1983L, 1983L, 1984L, 1984L,
1984L, 1984L, 1984L, 1984L, 1984L, 1984L, 1984L, 1984L, 1984L,
1984L, 1984L, 1984L, 1984L, 1984L, 1984L, 1985L, 1985L, 1985L,
1985L, 1985L, 1985L, 1985L, 1985L, 1985L, 1985L, 1985L, 1985L,
1985L, 1985L, 1985L, 1985L, 1985L)), row.names = c(NA, -110L
), class = c("tbl_df", "tbl", "data.frame"))
The code for the animation is as follows:
df %>%
  ggplot(aes(x = glob.perc, y = avg.income.country/1000, fill = region)) +
  geom_col(position = 'stack') +
  theme_minimal() +
  labs(subtitle = "Year: {frame_time}",
       x = element_blank(),
       y = element_blank(),
       fill = 'Region') +
  transition_time(year)
My sense is that this is not an issue of missing data: at each year the visualization is complete, without whitespace. I think it's an issue of how geom_col() transitions.
Upvotes: 1
Views: 181
Reputation: 66765
gganimate is running into trouble with your data set since some year/country/glob.perc combinations have multiple observations and some have none. It's assuming (incorrectly) that some of the values you are tracking are moving between glob.perc categories from year to year. One way to solve this is to ensure there is exactly one value for each year/country/glob.perc combination. Here, I averaged the duplicates and set avg.income.country to zero for the missing combinations. There's probably a smarter way to do this, perhaps with imputed values based on neighboring ones or a regression model.
df %>%
  group_by(year, region, country, glob.perc) %>%
  summarize(avg.income.country = mean(avg.income.country), n = n()) %>%
  ungroup() %>%
  complete(year, nesting(country, region), glob.perc, fill = list(avg.income.country = 0)) %>%
  ggplot(aes(x = glob.perc, y = avg.income.country/1000, fill = region)) +
  geom_col(position = 'stack', color = "black", alpha = 0.7) +
  theme_minimal() +
  labs(subtitle = "Year: {frame_time}",
       x = element_blank(),
       y = element_blank(),
       fill = 'Region') +
  transition_time(year)
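To convince yourself that the summarize/complete step really leaves one row per combination, a quick check along these lines may help. This is only a sketch on a made-up toy tibble (the name toy and its numbers are hypothetical, not taken from your data); it collapses duplicates with mean() and zero-fills gaps the same way as above, then counts rows per year/country/glob.perc.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: DE has two rows for glob.perc 7 in 1980,
# no row for glob.perc 9 in 1980 and no row for glob.perc 7 in 1981.
toy <- tibble(
  year = c(1980L, 1980L, 1980L, 1981L, 1981L),
  country = "DE",
  region = "Europe",
  glob.perc = c(7, 7, 8, 8, 9),
  avg.income.country = c(15598, 15171, 18202, 18776, 14908)
)

balanced <- toy %>%
  # collapse duplicated combinations into a single averaged row
  group_by(year, region, country, glob.perc) %>%
  summarize(avg.income.country = mean(avg.income.country), .groups = "drop") %>%
  # add the combinations that are absent, with a value of zero
  complete(year, nesting(country, region), glob.perc,
           fill = list(avg.income.country = 0))

# Every year/country/glob.perc combination should now appear exactly once
check <- count(balanced, year, country, glob.perc)
all(check$n == 1)
```

If the last line returns FALSE on your real data, some combination is still duplicated and the bars can still appear to jump between categories.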
Here's a look at the number of observations in the original data. Note that some are doubled up and some are missing. This creates an ambiguity for gganimate, since it's unclear whether the unit you're tracking has disappeared (it has) or whether it has moved to another glob.perc category (what gganimate assumed).
df %>%
  count(year, region, country, glob.perc) %>%
  ggplot(aes(year, glob.perc, fill = n)) +
  geom_tile() +
  facet_wrap(~country)
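Another option that may be worth trying is to tell gganimate explicitly which unit to track. As far as I know, transition_time() uses each layer's group aesthetic to match rows between time points, so setting group to the country/glob.perc combination should keep each bar segment inside its own category. The sketch below uses a small made-up data frame (toy and its values are hypothetical), and I have not run it against your full data.

```r
library(ggplot2)
library(gganimate)

# Hypothetical toy data with exactly one row per year/country/glob.perc
toy <- data.frame(
  year = c(1980L, 1980L, 1980L, 1981L, 1981L, 1981L),
  country = "DE",
  region = "Europe",
  glob.perc = c(7, 8, 9, 7, 8, 9),
  avg.income.country = c(15598, 18202, 39607, 16816, 18776, 17551)
)

anim <- ggplot(toy, aes(x = glob.perc, y = avg.income.country / 1000,
                        fill = region,
                        # one group per tracked unit, so rows are matched
                        # within a glob.perc category rather than across them
                        group = interaction(country, glob.perc))) +
  geom_col(position = "stack") +
  labs(subtitle = "Year: {frame_time}") +
  transition_time(year)

# animate(anim) renders the animation; this needs a renderer such as gifski
```

Duplicated rows within a single group would still be ambiguous, so this complements the de-duplication above rather than replacing it.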
Plotting these as lines, we can see that something is a little fishy in the underlying data. You might take another look there and see if your glob.perc code is working the way you intend. If the categories mean what they sound like, I would have assumed the lines would not cross.
df %>%
  filter(country == "DE", glob.perc >= 2) %>%
  ggplot(aes(year, avg.income.country, color = as.character(glob.perc), group = glob.perc)) +
  geom_line() +
  facet_wrap(~country)
Upvotes: 1
Reputation: 4514
What you are showing are the single states at each time step. But there is not much change in value from one step to the other, which leads to the flickering. I believe what you want to see is something like below.
For this I have grouped the data with group_by and added a column sum holding the cumulative value. The filter is there to limit rendering times.
library('tidyverse')
library('gganimate')
ddf_anim <- data.frame(time = rep(1:50, 200)) %>%
  arrange(time) %>%
  mutate(type = rep(c(rep('A', 100), rep('B', 100)), 50),
         class = rep((1:100), 100),
         value = runif(10000, 0, 1)) %>%
  filter(time <10) %>%
  group_by(class, type) %>%
  mutate(sum = cumsum(value)) %>%
  ggplot(aes(x = class, y = sum, fill = type)) +
  geom_col() +
  transition_time(time)
ddf_anim
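To make the grouped cumsum step concrete, here is a minimal sketch on a tiny hand-made data frame (toy and its values are hypothetical, chosen so the running totals are easy to follow). The rows are sorted by time first, because cumsum accumulates in row order within each class/type group.

```r
library(dplyr)

# Hypothetical toy data: one class, two types, three time steps
toy <- data.frame(
  time = c(1, 1, 2, 2, 3, 3),
  class = 1,
  type = c("A", "B", "A", "B", "A", "B"),
  value = c(0.5, 0.25, 0.25, 0.5, 1, 0.25)
)

running <- toy %>%
  arrange(time) %>%                # cumsum follows row order, so sort by time
  group_by(class, type) %>%        # one running total per bar segment
  mutate(sum = cumsum(value)) %>%
  ungroup()

running
```

Type A accumulates 0.5, 0.75, 1.75 and type B accumulates 0.25, 0.75, 1, so each segment can only grow from one frame to the next.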
Upvotes: 0