Stefan Gouyet
Stefan Gouyet

Reputation: 137

Filling in Rows with Missing Data

I have a specific code that I want to write in R that I couldn't find an answer to on Stack Overflow. I am manipulating a dataset of continents data and am looking to calculate cumulative values for each year. This is a snapshot of what the df looks like:

Continent   Year  Value     Cumulative Value
   <chr>      <dbl> <dbl>              <dbl>
 1 Europe     2000.   10.                10.
 2 Asia       2000.   30.                30.
 3 Africa     2000.   67.                67.
 4 N. America 2000.   23.                23.
 5 S. America 2000.   19.                19.
 6 Europe     2001.    3.                13.
 7 Asia       2001.    4.                34.
 8 Africa     2001.    3.                70.
 9 Europe     2002.    3.                16.
10 Asia       2002.    9.                43.
11 Africa     2002.    2.                72.
12 N. America 2002.    4.                27.
13 S. America 2002.   90.               109.

My issue is that not every continent has a value every year, yet I still need the cumulative value for that year. The cumulative value for that year would be the same for that specific continent as the previous year.

For example, in 2001, N. America and S. America do not have a row, and I would like both to show up with value = 0 and cumulative value as 23 and 19, respectively, the same as the previous year (in year 2000). I am unsure what code would accomplish this so any advice would be greatly appreciated.

Continent   Year  Value     Cumulative Value
N. America 2001.   0.                23.
S. America 2001.   0.                19.

Let me know if I should provide more information and thanks again!

data

structure(list(Continent = c("Europe", "Asia", "Africa", "N. America", 
"S. America", "Europe", "Asia", "Africa", "Europe", "Asia", "Africa", 
"N. America", "S. America"), Year = c(2000, 2000, 2000, 2000, 
2000, 2001, 2001, 2001, 2002, 2002, 2002, 2002, 2002), Value = c(10, 
30, 67, 23, 19, 3, 4, 3, 3, 9, 2, 4, 90), `Cumulative Value` = c(10, 
30, 67, 23, 19, 13, 34, 70, 16, 43, 72, 27, 109)), .Names = c("Continent", 
"Year", "Value", "Cumulative Value"), row.names = c(NA, -13L), class = c("tbl_df", 
"tbl", "data.frame"))

Upvotes: 0

Views: 70

Answers (2)

sbha
sbha

Reputation: 10432

Here's a tidyverse option:

library(tidyverse)

df %>% 
  complete(Continent, Year) %>% 
  replace_na(list(Value = 0)) %>% 
  fill(Cumulative) 

# A tibble: 15 x 4
   Continent   Year Value Cumulative
   <chr>      <int> <dbl>      <int>
 1 Africa      2000    67         67
 2 Africa      2001     3         70
 3 Africa      2002     2         72
 4 Asia        2000    30         30
 5 Asia        2001     4         34
 6 Asia        2002     9         43
 7 Europe      2000    10         10
 8 Europe      2001     3         13
 9 Europe      2002     3         16
10 N. America  2000    23         23
11 N. America  2001     0         23
12 N. America  2002     4         27
13 S. America  2000    19         19
14 S. America  2001     0         19
15 S. America  2002    90        109

Upvotes: 1

Gregor Thomas
Gregor Thomas

Reputation: 146129

This should work, but is untested since your data isn't shared in a copy/pasteable way. Share dput(your_sample_data) and I will test/debug.

library(dplyr)
library(tidyr)
complete(your_data, Continent, Year, fill = list(Value = 0)) %>%
  group_by(Continent) %>%
  mutate(`Cumulative Value` = zoo::na.locf(`Cumulative Value`))
# A tibble: 15 x 4
# Groups:   Continent [5]
   Continent   Year Value    CV
   <chr>      <dbl> <dbl> <dbl>
 1 Africa      2000    67    67
 2 Africa      2001     3    70
 3 Africa      2002     2    72
 4 Asia        2000    30    30
 5 Asia        2001     4    34
 6 Asia        2002     9    43
 7 Europe      2000    10    10
 8 Europe      2001     3    13
 9 Europe      2002     3    16
10 N. America  2000    23    23
11 N. America  2001     0    23
12 N. America  2002     4    27
13 S. America  2000    19    19
14 S. America  2001     0    19
15 S. America  2002    90   109

Upvotes: 2

Related Questions