Ricky
Ricky

Reputation: 2750

How to update the next row of the current row when a condition is met

I have a data table as below:

library(data.table)
library(lubridate)

dput(data)
structure(list(Id = c(1, 1, 1, 1), start = structure(c(1509525095, 
1509529535, 1509532655, 1509543455), class = c("POSIXct", "POSIXt"
), tzone = "NA"), end = structure(c(1509525450, 1509529535, 1509535650, 
1509549450), class = c("POSIXct", "POSIXt"), tzone = "NA"), spot = structure(c(1509524490, 
1509529235, 1509529715, 1509542250), class = c("POSIXct", "POSIXt"
), tzone = "NA"), type = structure(c(1L, 1L, 3L, 1L), .Label = c("1", 
"2", "3"), class = "factor"), consumption = structure(c(10.0833333333333, 
5, 49, 20.0833333333333), units = "mins", class = "difftime")), .Names = c("Id", 
"start", "end", "spot", "type", "consumption"), row.names = c(NA, 
-4L), class = c("data.table", "data.frame"))

From this I want to add a new column spot_new to the row after the row where the conditon start=end end is met.

I tried

  data[start=end, data:=c(NA, spot[-.N]), by=Id]

But this doesn't do what I wanted.Any help is appreciated.

Desired Output

Desired Output

Upvotes: 2

Views: 65

Answers (2)

akrun
akrun

Reputation: 886948

Here we get the row index with .I of the next row by adding 1 to it. To take care of edge cases where the last row of a group have 'start' and 'end' as equal, use the pmin to get the last row (not clear about what to do in that case though)

 i1 <- data[, .I[pmin(which(start == end)+1, .N)], Id]$V1
 data[i1, spot_new := spot][]
# Id               start                 end                spot type   consumption            spot_new
#1:  1 2017-11-01 08:31:35 2017-11-01 08:37:30 2017-11-01 08:21:30    1 10.08333 mins                <NA>
#2:  1 2017-11-01 09:45:35 2017-11-01 09:45:35 2017-11-01 09:40:35    1  5.00000 mins                <NA>
#3:  1 2017-11-01 10:37:35 2017-11-01 11:27:30 2017-11-01 09:48:35    3 49.00000 mins 2017-11-01 09:48:35
#4:  1 2017-11-01 13:37:35 2017-11-01 15:17:30 2017-11-01 13:17:30    1 20.08333 mins                <NA>

Upvotes: 1

Sotos
Sotos

Reputation: 51582

I can offer a dplyr solution which works with a rowwise, if else statement in order to fill the column with the spot. We then use lag to move it a position, i.e.

library(dplyr)

df %>% 
 group_by(Id) %>% 
 rowwise() %>% 
 mutate(spot_new = if(start == end){spot}else(NA)) %>% 
 ungroup() %>% 
 mutate(spot_new = lag(spot_new))

which gives

# A tibble: 4 x 7
     Id               start                 end                spot   type   consumption            spot_new
  <dbl>              <dttm>              <dttm>              <dttm> <fctr>        <time>              <dttm>
1     1 2017-11-01 08:31:35 2017-11-01 08:37:30 2017-11-01 08:21:30      1 10.08333 mins                  NA
2     1 2017-11-01 09:45:35 2017-11-01 09:45:35 2017-11-01 09:40:35      1  5.00000 mins                  NA
3     1 2017-11-01 10:37:35 2017-11-01 11:27:30 2017-11-01 09:48:35      3 49.00000 mins 2017-11-01 09:40:35
4     1 2017-11-01 13:37:35 2017-11-01 15:17:30 2017-11-01 13:17:30      1 20.08333 mins                  NA

Upvotes: 3

Related Questions