Reputation: 1641
I have a dataframe that looks like this:
   value id
1      2  A
2      5  A
3     NA  A
4      7  A
5      9  A
6      1  B
7     NA  B
8     NA  B
9      5  B
10     6  B
I would like to calculate growth rates of value within each id group. Usually, I would do something like this:
df <- df %>%
  group_by(id) %>%
  mutate(growth = log(value) - log(lag(value)))
To get this dataframe:
   value    id    growth
   (dbl) (chr)     (dbl)
1      2     A        NA
2      5     A 0.9162907
3     NA     A        NA
4      7     A        NA
5      9     A 0.2513144
6      1     B        NA
7     NA     B        NA
8     NA     B        NA
9      5     B        NA
10     6     B 0.1823216
Now I want the growth rate to use the last non-NA value as well, i.e. to calculate growth across the "NA gaps". For example, row 4 should contain the growth rate from 5 to 7, and row 9 the growth rate from 1 to 5.
Thanks!
Upvotes: 1
Views: 268
Reputation: 887048
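We can use fill() from tidyr (loaded with the tidyverse) to carry the last non-NA value forward within each group, and then take the log difference:
library(tidyverse)
df %>%
  group_by(id) %>%
  fill(value) %>%                                 # replace each NA with the previous non-NA value
  mutate(growth = log(value) - log(lag(value)))   # log growth relative to the previous row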
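To make this easy to try, here is a minimal reproducible sketch. The data frame is rebuilt from the question, the result name res is only illustrative, and the code assumes growth is meant as a log difference, so log() is applied to the lagged value as well.

```r
library(dplyr)
library(tidyr)

# Example data rebuilt from the question
df <- data.frame(
  value = c(2, 5, NA, 7, 9, 1, NA, NA, 5, 6),
  id    = rep(c("A", "B"), each = 5)
)

res <- df %>%
  group_by(id) %>%
  fill(value) %>%                                    # carry the last non-NA value forward within each id
  mutate(growth = log(value) - log(lag(value))) %>%  # assumption: growth = log difference to the previous row
  ungroup()

print(res)
```

Because the NAs in value are overwritten, the former gap rows (3, 7 and 8) get a growth of 0, while row 4 becomes log(7/5) and row 9 becomes log(5/1).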
Upvotes: 2
Reputation: 140
zoo::na.locf will replace NAs with the last non-NA value, so this may work for you:
df <- df %>%
  group_by(id) %>%
  mutate(
    # na.rm = FALSE keeps leading NAs, so the result has the same length as value
    valuenoNA = zoo::na.locf(value, na.rm = FALSE),
    growth = log(valuenoNA) - log(lag(valuenoNA)))
This gives:
   value id    growth valuenoNA
1      2  A        NA         2
2      5  A 0.9162907         5
3     NA  A 0.0000000         5
4      7  A 0.3364722         7
5      9  A 0.2513144         9
6      1  B        NA         1
7     NA  B 0.0000000         1
8     NA  B 0.0000000         1
9      5  B 1.6094379         5
10     6  B 0.1823216         6
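If the gap rows themselves should stay NA, with growth only bridged on the first observed row after a gap, one possible variant is to lag the filled column but take the log of the original value. This is a sketch under that assumption, not part of the answer above; last_obs and res are hypothetical names, and growth is again assumed to be a log difference.

```r
library(dplyr)

# Example data rebuilt from the question
df <- data.frame(
  value = c(2, 5, NA, 7, 9, 1, NA, NA, 5, 6),
  id    = rep(c("A", "B"), each = 5)
)

res <- df %>%
  group_by(id) %>%
  mutate(
    # last_obs (hypothetical helper): last observed value up to and including this row
    last_obs = zoo::na.locf(value, na.rm = FALSE),
    # log(value) is NA on gap rows, so growth stays NA there;
    # lag(last_obs) is the last non-NA value before the current row
    growth = log(value) - log(lag(last_obs))
  ) %>%
  ungroup()

print(res)
```

With the question's data, rows 3, 7 and 8 keep NA, row 4 is log(7/5) and row 9 is log(5/1).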
Upvotes: 2