Reputation: 533
I have following dataframe:
df <- structure(list(DateTime = structure(c(1477978200, 1477980000, 1477981800, 1477983600, 1477985400, 1477987200, 1477989000, 1477990800, 1477992600, 1477994400, 1477996200, 1477998000, 1477999800, 1478001600, 1478003400, 1478005200, 1478007000, 1478008800, 1478010600, 1478012400, 1478014200, 1478016000, 1478017800, 1478019600, 1478021400, 1478023200, 1478025000, 1478026800, 1478028600, 1478030400, 1478032200, 1478034000, 1478035800, 1478037600, 1478039400, 1478041200, 1478043000, 1478044800, 1478046600, 1478048400, 1478050200, 1478052000, 1478053800, 1478055600, 1478057400, 1478059200, 1478061000, 1478062800, 1478064600, 1478066400, 1478068200, 1478070000, 1478071800, 1478073600, 1478075400, 1478077200, 1478079000, 1478080800, 1478082600, 1478084400, 1478086200, 1478088000, 1478089800, 1478091600, 1478093400, 1478095200, 1478097000, 1478098800, 1478100600, 1478102400, 1478104200, 1478106000, 1478107800, 1478109600, 1478111400, 1478113200, 1478115000, 1478116800, 1478118600, 1478120400), class = c("POSIXct", "POSIXt"), tzone = "America/Chicago"), Date = structure(c(17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17106, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107, 17107), class = "Date"), Rain_daily = c(8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), PET_daily = c(7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4), Mask = c(TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE)), row.names = c(NA, 80L), class = "data.frame")
I want to change values in column "Mask" to "FALSE" for two consecutive days if "Rain_daily" is greater than "PET_daily" for a particular day. For example, for the given dataframe, values in Mask column will be replaced with FALSE for both the days "2016-11-01" and "2016-11-02" (as 8>7).
How can we do this? I want to avoid for looping.
Upvotes: 0
Views: 124
Reputation: 42544
In a comment, the OP has mentioned My dataframe is really big having multiple columns.
Therefore, here is a data.table approach which assigns by reference only the Mask
column. This avoids to copy the whole dataset.
library(data.table)
setDT(df2)
df2[, Mask := !Date %in% CJ(.SD[Rain_daily > PET_daily, Date], 0:1, unique = TRUE)[, V1 + V2]]
df2
DateTime Date Rain_daily PET_daily Mask 1: 2015-01-01 00:00:00 2015-01-01 1 2 TRUE 2: 2015-01-01 00:30:00 2015-01-01 1 2 TRUE 3: 2015-01-01 01:00:00 2015-01-01 1 2 TRUE 4: 2015-01-02 00:00:00 2015-01-02 3 2 FALSE 5: 2015-01-02 00:30:00 2015-01-02 3 2 FALSE 6: 2015-01-02 01:00:00 2015-01-02 3 2 FALSE 7: 2015-01-03 00:00:00 2015-01-03 1 2 FALSE 8: 2015-01-03 00:30:00 2015-01-03 1 2 FALSE 9: 2015-01-03 01:00:00 2015-01-03 1 2 FALSE 10: 2015-01-04 00:00:00 2015-01-04 1 2 TRUE 11: 2015-01-04 00:30:00 2015-01-04 1 2 TRUE 12: 2015-01-04 01:00:00 2015-01-04 1 2 TRUE 13: 2015-01-05 00:00:00 2015-01-05 3 2 FALSE 14: 2015-01-05 00:30:00 2015-01-05 3 2 FALSE 15: 2015-01-05 01:00:00 2015-01-05 3 2 FALSE 16: 2015-01-06 00:00:00 2015-01-06 4 2 FALSE 17: 2015-01-06 00:30:00 2015-01-06 4 2 FALSE 18: 2015-01-06 01:00:00 2015-01-06 4 2 FALSE 19: 2015-01-07 00:00:00 2015-01-07 1 2 FALSE 20: 2015-01-07 00:30:00 2015-01-07 1 2 FALSE 21: 2015-01-07 01:00:00 2015-01-07 1 2 FALSE 22: 2015-01-08 00:00:00 2015-01-08 1 2 TRUE 23: 2015-01-08 00:30:00 2015-01-08 1 2 TRUE 24: 2015-01-08 01:00:00 2015-01-08 1 2 TRUE DateTime Date Rain_daily PET_daily Mask
Note that a new sample dataset has been used here which includes different days to demonstrate different use cases, see section below.
As a reproducible example only three timestamps per day but at 8 different days are simulated.
library(data.table)
df2 <- fread("
Date Rain_daily PET_daily
2015-01-01 1 2
2015-01-02 3 2
2015-01-03 1 2
2015-01-04 1 2
2015-01-05 3 2
2015-01-06 4 2
2015-01-07 1 2
2015-01-08 1 2
")[CJ(Date, DateTime = as.ITime(60 * 30 * 0:2))[
, DateTime := as.POSIXct(Date) + DateTime], on = "Date"][
, setcolorder(.SD, "DateTime")]
df2
DateTime Date Rain_daily PET_daily 1: 2015-01-01 00:00:00 2015-01-01 1 2 2: 2015-01-01 00:30:00 2015-01-01 1 2 3: 2015-01-01 01:00:00 2015-01-01 1 2 4: 2015-01-02 00:00:00 2015-01-02 3 2 5: 2015-01-02 00:30:00 2015-01-02 3 2 6: 2015-01-02 01:00:00 2015-01-02 3 2 7: 2015-01-03 00:00:00 2015-01-03 1 2 8: 2015-01-03 00:30:00 2015-01-03 1 2 9: 2015-01-03 01:00:00 2015-01-03 1 2 10: 2015-01-04 00:00:00 2015-01-04 1 2 11: 2015-01-04 00:30:00 2015-01-04 1 2 12: 2015-01-04 01:00:00 2015-01-04 1 2 13: 2015-01-05 00:00:00 2015-01-05 3 2 14: 2015-01-05 00:30:00 2015-01-05 3 2 15: 2015-01-05 01:00:00 2015-01-05 3 2 16: 2015-01-06 00:00:00 2015-01-06 4 2 17: 2015-01-06 00:30:00 2015-01-06 4 2 18: 2015-01-06 01:00:00 2015-01-06 4 2 19: 2015-01-07 00:00:00 2015-01-07 1 2 20: 2015-01-07 00:30:00 2015-01-07 1 2 21: 2015-01-07 01:00:00 2015-01-07 1 2 22: 2015-01-08 00:00:00 2015-01-08 1 2 23: 2015-01-08 00:30:00 2015-01-08 1 2 24: 2015-01-08 01:00:00 2015-01-08 1 2 DateTime Date Rain_daily PET_daily
Upvotes: 1
Reputation: 1241
using case_when
from dplyr
.
if date meets the condition Rain_daily > PET_daily
, chenge Mask
for these (df[Rain_daily > PET_daily, Date]
) and next (df[Rain_daily > PET_daily, Date]+1
) days. elsewhere leave Mask
as it is (TRUE ~ Mask
)
library(data.table)
library(dplyr)
df <- df %>% as.data.table() # to simlify operations on df
df <- df %>%
mutate(Mask = case_when(Date %in% c(df[Rain_daily > PET_daily, Date],df[Rain_daily > PET_daily, Date]+1) ~ FALSE,
TRUE ~Mask ))
Hope that helps
Upvotes: 1