Village.Idyot
Village.Idyot

Reputation: 2043

How to use the tidyr complete and fill functions to extend a column of NA's in R dataframe?

I'm using the tidyr complete() and fill() functions to extend (copy down) a dataframe so all ID elements have the same number of rows. The code posted at the bottom correctly extends all fields, with the exception of the "Bal2" column of the dataframe where a series of NA's should be extended. Any recommendations for how to correct this?

The NA values do serve a calculation purpose in the fuller code this is deployed in. Also please note that I have another code snippet for correctly extending the "Period_2" column so I don't need help with "Period_2". It's been omitted for code brevity.

The below illustrates the issue when generating the testDF and testDF1 dataframes:

enter image description here

Code:

library(dplyr)
library(tidyr)

testDF <-
  data.frame(
    ID = c(rep(1,5),rep(50,3),rep(60,3)),
    Period_1 = c(1:5,1:3,1:3),
    Period_2 = c("2012-06","2012-07","2012-08","2012-09","2012-10","2013-06","2013-07","2013-08","2012-10","2012-11","2012-12"),
    Bal1 = c(rep(10,5),21:23,36:34),
    Bal2 = c(rep(12,8),rep(NA,3))
  )

testDF1 <- testDF %>%
  tidyr::complete(ID, nesting(Period_1)) %>%
  tidyr::fill(Bal1, Bal2, .direction = "down")
testDF1 <- as.data.frame(testDF1)

Upvotes: 0

Views: 1893

Answers (2)

r2evans
r2evans

Reputation: 160447

As mentioned in the comments, group by the ID and this should be resolved:

library(dplyr)
# library(tidyr)
testDF %>%
  tidyr::complete(ID, tidyr::nesting(Period_1)) %>%
  group_by(ID) %>%
  tidyr::fill(Bal1, Bal2, .direction = "down") %>%
  ungroup()
# # A tibble: 15 x 5
#       ID Period_1 Period_2  Bal1  Bal2
#    <dbl>    <int> <chr>    <dbl> <dbl>
#  1     1        1 2012-06     10    12
#  2     1        2 2012-07     10    12
#  3     1        3 2012-08     10    12
#  4     1        4 2012-09     10    12
#  5     1        5 2012-10     10    12
#  6    50        1 2013-06     21    12
#  7    50        2 2013-07     22    12
#  8    50        3 2013-08     23    12
#  9    50        4 NA          23    12
# 10    50        5 NA          23    12
# 11    60        1 2012-10     36    NA
# 12    60        2 2012-11     35    NA
# 13    60        3 2012-12     34    NA
# 14    60        4 NA          34    NA
# 15    60        5 NA          34    NA

Upvotes: 1

Captain Hat
Captain Hat

Reputation: 3237

Use dplry::group_by() before you fill():

require(dplyr)
#> Loading required package: dplyr
require(tidyr)
#> Loading required package: tidyr

test <- tribble(
  ~id,    ~value,
  "A",    80,
  "A",    NA,
  "A",    NA,
  "B",    NA,
  "B",    NA
)

fill(test, value)
#> # A tibble: 5 × 2
#>   id    value
#>   <chr> <dbl>
#> 1 A        80
#> 2 A        80
#> 3 A        80
#> 4 B        80
#> 5 B        80

test <- group_by(test, id)
fill(test, value)
#> # A tibble: 5 × 2
#> # Groups:   id [2]
#>   id    value
#>   <chr> <dbl>
#> 1 A        80
#> 2 A        80
#> 3 A        80
#> 4 B        NA
#> 5 B        NA

Created on 2022-11-23 with reprex v2.0.2

Upvotes: 0

Related Questions