Reputation: 5270
I want to increment the value of a column whenever another column signals a change. For example, in the following data, I would like to create the column grp based on the value column, which is a binary variable marking a change point. I tried to do this by creating temp1, but the result is not what I want.
library(tidyverse)
as_tibble(c(1,0,0,0,1,0,1,0)) %>%
  mutate(temp1 = 1,
         lag_temp1 = lag(temp1, 1, default = 1),
         temp1 = ifelse(row_number() == 1, 1, value + lag_temp1)) %>%
  mutate(grp = c(1,1,1,1,2,2,3,3)) %>%   # desired result, entered by hand
  print()
# A tibble: 8 x 4
  value temp1 lag_temp1   grp
  <dbl> <dbl>     <dbl> <dbl>
1     1     1         1     1
2     0     1         1     1
3     0     1         1     1
4     0     1         1     1
5     1     2         1     2
6     0     1         1     2
7     1     2         1     3
8     0     1         1     3
Apart from getting grp right, I would also like to know why my solution did not work. I have used similar logic in other places in my data analysis, so it would be very helpful to know where the mistake is. Apart from inbuilt cumsum I may have to use other functions at times.
Upvotes: 0
Views: 462
Reputation: 26343
To get the grp variable right we can use cumsum:
library(tidyverse)
as_tibble(c(1, 0, 0, 0, 1, 0, 1, 0)) %>%
  mutate(grp = cumsum(value))
# A tibble: 8 x 2
#  value   grp
#  <dbl> <dbl>
#1     1     1
#2     0     1
#3     0     1
#4     0     1
#5     1     2
#6     0     2
#7     1     3
#8     0     3
In your solution there is no difference between temp1 and lag_temp1 in the first place:
as_tibble(c(1,0,0,0,1,0,1,0)) %>%
  mutate(temp1 = 1,
         lag_temp1 = lag(temp1, 1, default = 1))
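Both columns are constant 1, because mutate works on whole columns at once: lag_temp1 is computed a single time from the constant temp1 and never sees the values assigned to temp1 afterwards, so no running total is carried from row to row. So in the end temp1 is simply c(1, value[-1] + 1).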
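A quick way to check this is sketched below. It assumes the data is built with tibble(value = ...) rather than as_tibble() on a bare vector (my substitution, so the column name is explicit); the rest is the code from the question.

```r
library(dplyr)

# Same data as in the question; the column is named explicitly (assumption)
df <- tibble(value = c(1, 0, 0, 0, 1, 0, 1, 0))

res <- df %>%
  mutate(temp1 = 1,
         # lag of a constant column is still a constant column of 1s
         lag_temp1 = lag(temp1, 1, default = 1),
         # evaluated once for the whole column, not row by row
         temp1 = ifelse(row_number() == 1, 1, value + lag_temp1))

# lag_temp1 is 1 everywhere, so temp1 is 1 followed by value + 1
all(res$lag_temp1 == 1)
all(res$temp1 == c(1, df$value[-1] + 1))
```

Both checks should come back TRUE, which is why temp1 drops back to 1 after every change point instead of accumulating.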
It is not entirely clear to me what is meant by "Apart from inbuilt cumsum I may have to use other functions at times", because that depends on the specific case. For the example above, cumsum does the job.
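If the concern is a running calculation where each row depends on the previous result and no built-in cumulative function fits, one possible approach is base R's Reduce with accumulate = TRUE. The sketch below is only an illustration: step is a hypothetical update rule I made up, and here it is chosen so that it reproduces cumsum on the example data.

```r
library(dplyr)

df <- tibble(value = c(1, 0, 0, 0, 1, 0, 1, 0))

# Hypothetical update rule: take the previous group id and add the change flag.
# Swap in whatever row-to-row logic the real problem needs.
step <- function(prev_grp, flag) prev_grp + flag

out <- df %>%
  # Reduce(..., accumulate = TRUE) keeps every intermediate result,
  # so each row sees the value computed for the row before it
  mutate(grp = Reduce(step, value, accumulate = TRUE))

out$grp
```

With this particular rule the result matches cumsum(value); the point is only that the previous result is actually carried forward, which lag on a column cannot do.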
Upvotes: 1