bretauv
bretauv

Reputation: 8557

Expand rows per group while changing only one variable

I have a data frame like this:

library(tibble)

test <- tibble(
  id = rep(c("A", "B"), each = 3),
  time = rep(2000:2002, times = 2),
  value = c(2, 1, 10, 4, 3, 6)
)

# A tibble: 6 x 3
  id     time value
  <chr> <int> <dbl>
1 A      2000     2
2 A      2001     1
3 A      2002    10
4 B      2000     4
5 B      2001     3
6 B      2002     6

For each id, I want to expand the time period back to 1995. The value from 1995 to 2000 will stay the same, the only thing that must change is the year. Here's the expected output:

# A tibble: 16 x 3
   id     time value
   <chr> <int> <dbl>
 1 A      1995     2
 2 A      1996     2
 3 A      1997     2
 4 A      1998     2
 5 A      1999     2
 6 A      2000     2
 7 A      2001     1
 8 A      2002    10
 9 B      1995     4
10 B      1996     4
11 B      1997     4
12 B      1998     4
13 B      1999     4
14 B      2000     4
15 B      2001     3
16 B      2002     6

How can I do that? (if possible in tidyverse style)

Upvotes: 0

Views: 81

Answers (2)

Ronak Shah
Ronak Shah

Reputation: 389055

We can use complete to create rows for time from 1995 to max time value for each id and fill the NA values with latest value available.

library(dplyr)
library(tidyr)

test %>%
  group_by(id) %>%
  complete(time = 1995:max(time)) %>%
  fill(value, .direction = 'up')

#   id     time value
#   <chr> <int> <dbl>
# 1 A      1995     2
# 2 A      1996     2
# 3 A      1997     2
# 4 A      1998     2
# 5 A      1999     2
# 6 A      2000     2
# 7 A      2001     1
# 8 A      2002    10
# 9 B      1995     4
#10 B      1996     4
#11 B      1997     4
#12 B      1998     4
#13 B      1999     4
#14 B      2000     4
#15 B      2001     3
#16 B      2002     6

Upvotes: 1

Wimpel
Wimpel

Reputation: 27742

a data.table approach

library( data.table )
setDT(test)
#create all combinations of id and year
DT <- CJ( id = unique( test$id ), time = 1995:2002 )
#rolling join value by ID and time (roll to nearest time in test)
DT[, value := test[DT, value, roll = "nearest", on = .(id, time) ] ]

#     id time value
#  1:  A 1995     2
#  2:  A 1996     2
#  3:  A 1997     2
#  4:  A 1998     2
#  5:  A 1999     2
#  6:  A 2000     2
#  7:  A 2001     1
#  8:  A 2002    10
#  9:  B 1995     4
# 10:  B 1996     4
# 11:  B 1997     4
# 12:  B 1998     4
# 13:  B 1999     4
# 14:  B 2000     4
# 15:  B 2001     3
# 16:  B 2002     6

Upvotes: 3

Related Questions