Reputation: 388
Say that I have dataset like this:
Object date date_data
1: N <NA>
2: A <NA>
3: A <NA>
4: A 2020-01-01 something
5: B <NA>
6: B <NA>
7: B 2020-01-01 something
8: C <NA>
9: C <NA>
10: C 2020-01-01 something
Generated by
example <- data.table(Object = rep(LETTERS[1:3], each=3), date = as.Date(rep(c(NA,NA,"2020-01-01"),3)), date_data = rep(c("","","something")))
example <- rbind(list(Object="N", date=as.Date(NA), date_data = ""), example)
I know that I can fill in the missing dates for each group using nafill:
example[, date:= nafill(date,"nocb"), by=Object]
Object date date_data
1: N <NA>
2: A 2020-01-01
3: A 2020-01-01
4: A 2020-01-01 something
5: B 2020-01-01
6: B 2020-01-01
7: B 2020-01-01 something
8: C 2020-01-01
9: C 2020-01-01
10: C 2020-01-01 something
What I would like to do is to fill in the date_data column in the same way as the date column. I cannot do that with the datatable's nafill option as it doesn't work with character values.
What would be a simple workaround? I have seen similar questions, but didn't find any that refer to the same problem
Edit
@Ronak Shah's data.table works for the original example. But if I change it to
example <- data.table(Object = c(rep("A",6), rep("B",3)), date = as.Date(rep(c(NA,NA,"2020-01-01"),3)), date_data = rep(c("","","something")))
example <- rbind(list(Object="N", date=as.Date(NA), date_data = ""), example)
Object date date_data
1: N <NA>
2: A <NA>
3: A <NA>
4: A 2020-01-01 something
5: A <NA>
6: A <NA>
7: A 2020-01-01 something
8: B <NA>
9: B <NA>
10: B 2020-01-01 something
na.fill does not longer work, I get the error message: Supplied 4 items to be assigned to group 2 of size 6 in column 'date_data'. The RHS length must either be 1 (single values are ok) or match the LHS length exactly. If you wish to 'recycle' the RHS please use rep() explicitly to make this intent clear to readers of your code.
Upvotes: 2
Views: 2296
Reputation: 25225
Another option using data.table
applying to all types of columns at once:
#change empty string to NAs if there are more columns, use set
example[date_data=="", date_data := NA_character_]
cols <- c("date", "date_data")
example[, (cols) := lapply(.SD,
function(x) x[nafill(replace(seq.int(.N), is.na(x), NA_integer_), "nocb")]),
Object, .SDcols=cols]
Upvotes: 3
Reputation: 388817
You can replace empty values with NA
and use zoo::na.locf
.
library(data.table)
example[, date_data := zoo::na.locf(replace(date_data, date_data == "", NA)), Object]
example
# Object date date_data
# 1: N <NA>
# 2: A 2020-01-01 something
# 3: A 2020-01-01 something
# 4: A 2020-01-01 something
# 5: B 2020-01-01 something
# 6: B 2020-01-01 something
# 7: B 2020-01-01 something
# 8: C 2020-01-01 something
# 9: C 2020-01-01 something
#10: C 2020-01-01 something
and similarly using tidyr
's fill
:
library(dplyr)
example %>%
mutate(date_data = replace(date_data, date_data == "", NA)) %>%
group_by(Object) %>%
tidyr::fill(date_data, .direction = "up")
Upvotes: 3