MorrisseyJ

Reputation: 1271

gganimate lots of stacked geom_col producing white spaces in the animation

I am building an animation of stacked bar charts (calling geom_col). I have 100 columns. When I generate the animation I get a lot of white space in what should be filled columns.

See the gif below:

enter image description here

That gif is based on about 100k rows of data, so I can't post it all here. Notably, I can't reproduce this in a simpler example:

library('tidyverse')
library('gganimate')

data.frame(time = rep(1:50, 200)) %>%
  arrange(time) %>%
  mutate(type = rep(c(rep('A', 100), rep('B', 100)), 50), 
         class = rep((1:100), 100), 
         value = runif(10000, 0, 1)) %>%
  ggplot(aes(x = class, y = value, fill = type)) +
  geom_col() +
  transition_time(time)

This works fine. The toy data has no meaningful structure, but the animation shows no white spaces:

enter image description here

I tried adding ease_aes(), enter_fade(), exit_fade(), but none of that worked. Anyone have thoughts on what is causing this?

---UPDATE---

Following the comments, I filtered the data down to see what was going on. With just two countries and a few years of data, the problem appears to be that chunks of data move between percentiles, when what I want is for them to grow and shrink within each percentile. You can see it in the gif below:

enter image description here

The data that produced this is here:

df <- structure(list(country = c("US", "DE", "US", "US", "US", "DE", 
"US", "DE", "US", "DE", "US", "DE", "US", "DE", "DE", "US", "DE", 
"DE", "DE", "US", "DE", "US", "US", "US", "DE", "US", "DE", "US", 
"US", "DE", "DE", "US", "DE", "US", "DE", "DE", "US", "DE", "US", 
"DE", "US", "US", "DE", "US", "US", "DE", "US", "DE", "US", "DE", 
"US", "DE", "DE", "US", "DE", "DE", "US", "US", "DE", "US", "US", 
"DE", "US", "US", "DE", "US", "DE", "US", "DE", "US", "DE", "US", 
"DE", "US", "DE", "DE", "US", "DE", "US", "US", "US", "DE", "US", 
"DE", "US", "US", "DE", "US", "DE", "US", "DE", "US", "DE", "US", 
"DE", "US", "US", "DE", "US", "US", "DE", "US", "US", "DE", "US", 
"DE", "US", "DE", "US", "DE"), glob.perc = c(0, 1, 1, 2, 3, 3, 
4, 4, 5, 5, 6, 6, 7, 7, 7, 8, 8, 9, 9, 0, 1, 1, 2, 3, 3, 4, 4, 
5, 6, 6, 6, 7, 7, 8, 8, 8, 9, 9, 0, 1, 1, 2, 2, 3, 4, 4, 5, 5, 
6, 6, 7, 7, 7, 8, 8, 9, 9, 0, 1, 1, 2, 2, 3, 4, 4, 5, 5, 6, 6, 
7, 7, 8, 8, 9, 9, 9, 0, 1, 1, 2, 3, 3, 4, 4, 5, 6, 6, 7, 7, 8, 
8, 9, 9, 0, 1, 1, 2, 2, 3, 4, 4, 5, 6, 6, 7, 7, 8, 8, 9, 9), 
    avg.income.country = c(437288.3, 95483.3754884956, 140784.030084749, 
    140733.5, 92860.7570361667, 27041.1685330627, 82474.4007614941, 
    22845.1776491941, 75584.1480877374, 20954.7760014288, 70400.3370710519, 
    19852.2326809271, 54038.6152996391, 15598.3057384556, 15170.9872445152, 
    62785.1002246113, 18201.6743099168, 39606.7790727414, 39051.1193095399, 
    450574.9, 89747.1381942579, 143040.424101143, 144413.3, 95281.4131057479, 
    26564.8030858664, 84645.1806598295, 22453.3134663253, 99495.4, 
    58448.7245539485, 16815.8081430027, 15925.4607078112, 67342.4870614877, 
    18775.7716260376, 52078.6261482834, 14908.4732454128, 14586.6597398625, 
    60740.8587598986, 17551.4029073371, 449672.7, 85860.9513060095, 
    138573.062299181, 107999.713224424, 26551.7207203881, 118606.7, 
    81673.5478130351, 22256.5124499113, 74664.7815210055, 20289.8692320157, 
    69424.4509484861, 19130.6427260963, 53441.6796042233, 15011.8413898757, 
    14554.8379632521, 62031.6543795656, 17372.7239256402, 17038.0153770701, 
    59253.6721580242, 478696.8, 87965.3040019279, 141489.41469306, 
    110750.734809188, 28139.4736007857, 121395.4, 84564.2106500617, 
    23136.9326230234, 77452.4071740221, 20809.5254887263, 72187.8010950261, 
    19423.2184457137, 67965.6133547784, 18489.4603327709, 64700.6833849069, 
    17811.5804850837, 50612.3590346861, 14165.4003733601, 13829.472811758, 
    542123.2, 89948.9091254987, 158338.248242006, 156908.9, 104475.681782063, 
    29031.666816329, 92305.5514014955, 23750.4970524401, 107775.8, 
    78090.1791649968, 21282.8059573008, 73283.2631907787, 19808.7465702618, 
    69304.0213872794, 18813.7418777938, 65958.7178466761, 18090.1791160505, 
    559720.3, 92129.3365959901, 159846.146463587, 123870.105638014, 
    30030.7222753586, 135301.9, 94785.176213572, 24358.2621716462, 
    110644.4, 80286.8697338142, 21690.4391200441, 75280.156096728, 
    20090.0002975319, 71136.641950609, 19006.2143886443, 67594.6662796918, 
    18216.0069568407), region = c("Americas", "Europe", "Americas", 
    "Americas", "Americas", "Europe", "Americas", "Europe", "Americas", 
    "Europe", "Americas", "Europe", "Americas", "Europe", "Europe", 
    "Americas", "Europe", "Europe", "Europe", "Americas", "Europe", 
    "Americas", "Americas", "Americas", "Europe", "Americas", 
    "Europe", "Americas", "Americas", "Europe", "Europe", "Americas", 
    "Europe", "Americas", "Europe", "Europe", "Americas", "Europe", 
    "Americas", "Europe", "Americas", "Americas", "Europe", "Americas", 
    "Americas", "Europe", "Americas", "Europe", "Americas", "Europe", 
    "Americas", "Europe", "Europe", "Americas", "Europe", "Europe", 
    "Americas", "Americas", "Europe", "Americas", "Americas", 
    "Europe", "Americas", "Americas", "Europe", "Americas", "Europe", 
    "Americas", "Europe", "Americas", "Europe", "Americas", "Europe", 
    "Americas", "Europe", "Europe", "Americas", "Europe", "Americas", 
    "Americas", "Americas", "Europe", "Americas", "Europe", "Americas", 
    "Americas", "Europe", "Americas", "Europe", "Americas", "Europe", 
    "Americas", "Europe", "Americas", "Europe", "Americas", "Americas", 
    "Europe", "Americas", "Americas", "Europe", "Americas", "Americas", 
    "Europe", "Americas", "Europe", "Americas", "Europe", "Americas", 
    "Europe"), year = c(1980L, 1980L, 1980L, 1980L, 1980L, 1980L, 
    1980L, 1980L, 1980L, 1980L, 1980L, 1980L, 1980L, 1980L, 1980L, 
    1980L, 1980L, 1980L, 1980L, 1981L, 1981L, 1981L, 1981L, 1981L, 
    1981L, 1981L, 1981L, 1981L, 1981L, 1981L, 1981L, 1981L, 1981L, 
    1981L, 1981L, 1981L, 1981L, 1981L, 1982L, 1982L, 1982L, 1982L, 
    1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 
    1982L, 1982L, 1982L, 1982L, 1982L, 1982L, 1983L, 1983L, 1983L, 
    1983L, 1983L, 1983L, 1983L, 1983L, 1983L, 1983L, 1983L, 1983L, 
    1983L, 1983L, 1983L, 1983L, 1983L, 1983L, 1983L, 1984L, 1984L, 
    1984L, 1984L, 1984L, 1984L, 1984L, 1984L, 1984L, 1984L, 1984L, 
    1984L, 1984L, 1984L, 1984L, 1984L, 1984L, 1985L, 1985L, 1985L, 
    1985L, 1985L, 1985L, 1985L, 1985L, 1985L, 1985L, 1985L, 1985L, 
    1985L, 1985L, 1985L, 1985L, 1985L)), row.names = c(NA, -110L
), class = c("tbl_df", "tbl", "data.frame"))

The code for the animation is as follows:

df %>%  
  ggplot(aes(x = glob.perc, y = avg.income.country/1000, fill = region)) + 
  geom_col(position = 'stack') +
  theme_minimal() +
  labs(subtitle = "Year: {frame_time}", 
       x = element_blank(), 
       y = element_blank(), 
       fill = 'Region') +
  transition_time(year)

My sense is that this is not an issue of missing data: at each year the visualization is complete, without white space. I think it's an issue of how geom_col() transitions between years.

Upvotes: 1

Views: 181

Answers (2)

Jon Spring

Reputation: 66765

gganimate is running into trouble with your data set because some year/country/glob.perc combinations have multiple observations and some have none. It assumes (incorrectly) that some of the values you are tracking move between glob.perc categories from year to year. One way to solve this is to make sure there is exactly one value for each year/country/glob.perc combination. Here, I average the duplicates and set avg.income.country to zero for the missing combinations. There's probably a smarter way to do this, perhaps with values imputed from neighboring ones or from a regression model.

df %>%  
  group_by(year, region, country, glob.perc) %>%
  summarize(avg.income.country = mean(avg.income.country), n = n()) %>%
  ungroup() %>%
  complete(year, nesting(country, region), glob.perc, fill = list(avg.income.country = 0)) %>%
  
  ggplot(aes(x = glob.perc, y = avg.income.country/1000, fill = region)) + 
  geom_col(position = 'stack', color = "black", alpha = 0.7) +
  theme_minimal() +
  labs(subtitle = "Year: {frame_time}", 
       x = element_blank(), 
       y = element_blank(), 
       fill = 'Region') +
  transition_time(year)
  

enter image description here
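
As a quick sanity check on the summarize/complete step above, here is a minimal sketch on a small made-up table (the toy values and the names toy, fixed and rows_per_cell are illustrative assumptions, not taken from the question's data). It confirms that afterwards every year/country/glob.perc cell holds exactly one row.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: DE is doubled up at percentile 7 in 1980,
# DE never appears at percentile 8, and US is missing at 8 in 1981.
toy <- tibble(
  year               = c(1980L, 1980L, 1980L, 1980L, 1981L, 1981L),
  country            = c("DE", "DE", "US", "US", "DE", "US"),
  region             = c("Europe", "Europe", "Americas", "Americas", "Europe", "Americas"),
  glob.perc          = c(7, 7, 7, 8, 7, 7),
  avg.income.country = c(16, 14, 54, 63, 19, 67)
)

fixed <- toy %>%
  # collapse duplicates to one row per cell by averaging
  group_by(year, region, country, glob.perc) %>%
  summarise(avg.income.country = mean(avg.income.country), .groups = "drop") %>%
  # add the absent cells explicitly, with an income of zero
  complete(year, nesting(country, region), glob.perc,
           fill = list(avg.income.country = 0))

# every year/country/glob.perc cell should now occur exactly once
rows_per_cell <- count(fixed, year, country, glob.perc)
all(rows_per_cell$n == 1)
```

If the last line returns FALSE on your real data, some cells are still duplicated and gganimate will again have to guess which bar is which.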


Here's a look at the number of observations in the original data. Note that some are doubled up and some are missing. This creates an ambiguity for gganimate, since it's unclear whether the unit you're tracking has disappeared (it has) or whether it has moved to another glob.perc category (what gganimate assumed).

df %>%  
  count(year, region, country, glob.perc) %>%
  ggplot(aes(year, glob.perc, fill = n)) +
  geom_tile() +
  facet_wrap(~country)

enter image description here
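
A possible alternative, sketched here as an untested assumption rather than a confirmed fix for this data, is to tell gganimate explicitly which bar segment is which by setting the group aesthetic. The gganimate documentation describes rows as being matched between states by group, with unmatched rows entering or exiting. The toy data and the column name income below are made up for illustration.

```r
library(ggplot2)
library(gganimate)

# Hypothetical toy data: US has no percentile-8 row in 1981.
toy <- data.frame(
  year      = c(1980L, 1980L, 1980L, 1980L, 1981L, 1981L, 1981L),
  country   = c("DE", "US", "DE", "US", "DE", "US", "DE"),
  region    = c("Europe", "Americas", "Europe", "Americas",
                "Europe", "Americas", "Europe"),
  glob.perc = c(7, 7, 8, 8, 7, 7, 8),
  income    = c(15, 54, 18, 63, 19, 67, 15)
)

# One group per country/percentile pair, so a segment is only ever
# tweened with the same country in the same percentile.
anim <- ggplot(toy, aes(x = glob.perc, y = income, fill = region,
                        group = interaction(country, glob.perc))) +
  geom_col(position = "stack") +
  transition_time(year)
```

Printing anim (or passing it to animate()) renders the animation. Duplicated cells would still be matched by row order within a group, so averaging the duplicates first is probably still needed.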

Plotting these as lines, we can see that something is a little fishy in the underlying data. You might take another look there and see if your glob.perc code is working the way you intend. If the categories mean what they sound like, I would have assumed the lines would not cross.

df %>%  
  filter(country == "DE", glob.perc >= 2) %>%
  ggplot(aes(year, avg.income.country, color = as.character(glob.perc), group = glob.perc)) +
  geom_line() +
  facet_wrap(~country)

enter image description here
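
The "lines should not cross" expectation can also be checked without a plot. This is a rough sketch on made-up numbers (toy, rank_in_year and crossing are illustrative names): it ranks the categories by income within each year and flags any category whose rank changes between years, which is what a crossing line implies.

```r
library(dplyr)

# Hypothetical toy data: percentiles 7 and 8 swap order in 1981.
toy <- tibble(
  country            = "DE",
  year               = c(1980L, 1980L, 1980L, 1981L, 1981L, 1981L),
  glob.perc          = c(7, 8, 9, 7, 8, 9),
  avg.income.country = c(15, 18, 39, 19, 15, 40)
)

crossing <- toy %>%
  # rank the categories by income within each country and year
  group_by(country, year) %>%
  mutate(rank_in_year = rank(avg.income.country)) %>%
  # a category whose rank differs across years has crossed another line
  group_by(country, glob.perc) %>%
  summarise(n_ranks = n_distinct(rank_in_year), .groups = "drop") %>%
  filter(n_ranks > 1)

crossing
```

On real data this assumes one row per country/year/glob.perc, so it is best run after duplicates have been averaged.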

Upvotes: 1

MarBlo

Reputation: 4514

What you are showing is the individual state at each time step. But there is not much change in value from one step to the next, which leads to the flickering. I believe what you want to see is something like the animation below.

For this I grouped the data with group_by() and added a column, sum, that holds the cumulative total of value.

The filter is there only to limit rendering time.

library('tidyverse')
library('gganimate')

ddf_anim <- data.frame(time = rep(1:50, 200)) %>%
  arrange(time) %>%
  mutate(type = rep(c(rep('A', 100), rep('B', 100)), 50), 
         class = rep((1:100), 100), 
         value = runif(10000, 0, 1)) %>%
  filter(time <10) %>% 
  group_by(class, type) %>% 
  mutate(sum = cumsum(value)) %>% 
  ggplot(aes(x = class, y = sum, fill = type)) +
  geom_col() +
  transition_time(time)


ddf_anim

enter image description here
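
To make the cumulative step concrete, here is a small sketch on made-up values (toy, running and the column name cum_value are illustrative choices, not from the code above). It shows that cumsum() inside a grouped mutate() accumulates separately per class/type, and that the rows need to be in time order first.

```r
library(dplyr)

# Hypothetical toy data, deliberately not sorted by time.
toy <- tibble(
  time  = c(2L, 1L, 3L, 1L, 2L, 3L),
  class = 1L,
  type  = c("A", "A", "A", "B", "B", "B"),
  value = c(2, 1, 3, 10, 20, 30)
)

running <- toy %>%
  arrange(time) %>%                     # cumsum() follows row order
  group_by(class, type) %>%             # accumulate within each bar segment
  mutate(cum_value = cumsum(value)) %>%
  ungroup()

running
```

Without arrange(time), the running total would follow whatever order the rows happen to be in, which may not be chronological.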

Upvotes: 0
