Reputation: 645
I have a data.table with a two-column key (id, date) and one or more columns of data. Some of the data may have missing values, so I am using na.locf() from zoo to fill them in. I have noticed that this operation changes the key of my data.table, so I need to re-key it for subsequent joins. Why is this happening, and in what other situations can I expect such behavior?
You can use the code below to reproduce the issue.
Thanks!
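require(data.table)
require(zoo)
d <- data.table(id = rep(1:2, each = 5), date = rep(1:5, 2), value = c(1,2,NA,NA,NA, 6,7,8,9,10))
setkey(d, id, date)
# Fill gaps of at most one NA within each id; this returns a new data.table
x <- d[, lapply(.SD, na.locf, na.rm = FALSE, maxgap = 1), by = 'id']
key(d)  # key of the original table
key(x)  # key of the result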
Upvotes: 0
Views: 144
Reputation: 52637
I think this does what you want:
x <- copy(d)
x[, (3:length(x)) := lapply(.SD, na.locf, maxgap = 1), by = 'id', .SDcols=3:length(x)]
key(x)
Results in:
[1] "id" "date"
And x:
    id date value
 1:  1    1     1
 2:  1    2     2
 3:  1    3     1
 4:  1    4     2
 5:  1    5     1
 6:  2    1     6
 7:  2    2     7
 8:  2    3     8
 9:  2    4     9
10:  2    5    10
This assumes you don't need na.locf to be applied on the date column. Since you're not changing that column, using := on the other columns preserves the key on the table.
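To make the difference concrete, here is a minimal sketch that compares the two approaches side by side. The names x_new and x_ref are made up for illustration, and it keeps the question's na.rm = FALSE, maxgap = 1 arguments so that each group's result has the same length as the group. It only inspects the keys and does not try to reproduce the output shown above.

```r
library(data.table)
library(zoo)

d <- data.table(id = rep(1:2, each = 5), date = rep(1:5, 2),
                value = c(1, 2, NA, NA, NA, 6, 7, 8, 9, 10))
setkey(d, id, date)

# Grouped j-expression: builds and returns a brand-new table,
# so its key is not guaranteed to match the key of d
x_new <- d[, lapply(.SD, na.locf, na.rm = FALSE, maxgap = 1), by = "id"]

# := on a copy: only the non-key column 'value' is modified by reference,
# so the existing key is left alone
x_ref <- copy(d)
x_ref[, value := na.locf(value, na.rm = FALSE, maxgap = 1), by = "id"]

key(d)      # key of the original table
key(x_new)  # key of the rebuilt table
key(x_ref)  # key of the table updated with :=
```

If the grouped form is still preferred, calling setkey(x_new, id, date) afterwards restores the key, at the cost of the extra re-keying step the question is trying to avoid.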
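Also, I had to change the na.rm argument in your na.locf call back to its default (TRUE), as otherwise the call doesn't do anything: with na.rm = FALSE and maxgap = 1, the run of three NAs is longer than maxgap and is left unfilled.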
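As a rough illustration of how na.rm and maxgap interact, the sketch below runs na.locf on the first group's values only. The variable names (v, kept, dropped, filled) are made up for this example. It suggests that with the default na.rm the unfilled NAs are dropped, leaving a vector shorter than the group, which appears to be why value reads 1, 2, 1, 2, 1 for id 1 in the output above (the two remaining values recycled). Depending on the data.table version, assigning such a shorter result with := may be rejected with an error instead of recycled, so treat this as an assumption to check against your installed version.

```r
library(zoo)

# Values of the first group (id == 1) from the question
v <- c(1, 2, NA, NA, NA)

# na.rm = FALSE keeps the length; the 3-long NA run exceeds maxgap = 1,
# so nothing is filled
kept <- na.locf(v, na.rm = FALSE, maxgap = 1)

# Default na.rm = TRUE: the unfilled trailing NAs are removed,
# so the result is shorter than the input
dropped <- na.locf(v, maxgap = 1)

# Without maxgap, every NA is filled with the last observation
filled <- na.locf(v, na.rm = FALSE)

length(kept)
length(dropped)
filled
```

If the goal is to fill short gaps and leave long ones as NA, keeping na.rm = FALSE inside the := call is the variant that preserves both the group length and the key.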
Upvotes: 2