Reputation: 35
I am currently translating a model from Excel into R, but have come stuck when trying to write an IF statement which is partly based on the previous value within the same column. With my limited R knowledge and having searched the SO site, I can't find any previous examples of similar issues to work from.
The model is following a temperature profile highlighting notable events. Each time the temperature increases or decreases the event model returns a 1 or 0 accordingly. I have previously copied these types of models by subsetting in R, rather than nesting the IF statements as in excel, which has worked well. But I now have 2 problems which are:
rad_temp
reading is <25 then the column starts with a 0, otherwise it starts with a 1.The running values (event formula
: 2nd row-onwards) are partly based on formula referring to the previous value in the "event formula" column, which I am not sure how I would refer to in my R code (see image of excel screen) problem2. The formula in excel (using reference to the R df below) is -
IF((IF((cooling-cooling_minus1)=1,0)+IF((warming-warming_minus1)=1,1)+IF((cooling-cooling_minus1)=0, "same as previous value in "event formula""))=2,1,(IF((warming-warming_minus1)=1,1)+IF((cooling-cooling_minus1)=0,same as previous value in event formula)))
Thus, within my R code, I would like to create a new column event_formula
which replicates the excel formula above.
I have created a basic example of my dataset from which I am working and included it below.
library(tidyverse)
so_example <- structure(list(date_time = structure(c(1510917600, 1510917900,
1510918200, 1510918500, 1510918800, 1510919100, 1510919400, 1510919700,
1510920000, 1510920300, 1510920600, 1510920900, 1510921200, 1510921500,
1510921800, 1510922100, 1510922400, 1510922700, 1510923000, 1510923300,
1510923600, 1510923900), class = c("POSIXct", "POSIXt"), tzone = ""),
date = structure(c(17487, 17487, 17487, 17487, 17487, 17487,
17487, 17487, 17487, 17487, 17487, 17487, 17487, 17487, 17487,
17487, 17487, 17487, 17487, 17487, 17487, 17487), class = "Date"),
time = structure(c(40800, 41100, 41400, 41700, 42000, 42300,
42600, 42900, 43200, 43500, 43800, 44100, 44400, 44700, 45000,
45300, 45600, 45900, 46200, 46500, 46800, 47100), class = c("hms",
"difftime"), units = "secs"), rad_temp = c(17.65, 17.67,
17.67, 17.7, 17.72, 17.73, 17.75, 17.74, 17.74, 17.74, 17.72,
19.41, 26.45, 32.26, 35.88, 38.53, 40.37, 41.58, 42.31, 42.74,
42.96, 43.14), cooling = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), cooling_minus1 = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0), warming = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1,
1, 1, 1, 1, 0, 0, 0, 0, 0), warming_minus1 = c(0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0)), row.names = c(NA,
-22L), class = c("tbl_df", "tbl", "data.frame"), .Names = c("date_time",
"date", "time", "rad_temp", "cooling", "cooling_minus1", "warming",
"warming_minus1"))
so_example %>%
mutate(time = as.character(time))
Based on the models that I have created to-date, I would try to code this by subsetting - as I have previously with formula that doesn't refer to the previous value in the column that I am creating.
In my head I would write it as something like the following (although this does not cover my first problem of initialising the model - problem1), and I might be way off..
# Event model
so_example$event_formula <- 0
# Rule 1
so_example$event_formula[so_example$cooling-so_example$cooling_minus1==1] <- 0
# Rule 2
so_example$event_formula[so_example$warming-so_example$warming_minus1==1] <- 1
# Rule 3
so_example$event_formula[so_example$cooling-so_example$cooling_minus1==0] <- #ref to same as previous value
# Rule 3
so_example$event_formula[(so_example$cooling-so_example$cooling_minus1==0) & (so_example$warming-so_example$warming_minus1==1) & (so_example$cooling-so_example$cooling_minus1==0)] <- #ref to same as previous value
Ultimately, the result that i would expect to see from the event_formula column would be the same as below:
so_example$expected_result <- c(0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1)
Please let me know if more information is needed for anyone to assist!
Upvotes: 1
Views: 127
Reputation: 35
I have found that the code below correctly answered the problem. I ended up using a for loop with nested ifelse statements. I know it's not the FP that R is best used for, but it replicates my excel functionality quite well.
so_example$event_formula <- NA
for (i in 1:nrow(so_example)){
ifelse(i==1,
ifelse(so_example[[i,"rad_temp"]]>25, x <- 1, x <- 0),
ifelse((so_example[[i,"cooling"]]-so_example[[i-1,"cooling"]])==1, x <- 0,
ifelse((so_example[[i,"warming"]]-so_example[[i-1,"warming"]])==1, x <- 1,
ifelse((so_example[[i,"cooling"]]-so_example[[i-1,"cooling"]])==0, x <- (so_example[[i-1,"event_formula"]]),"FALSE"
))))
so_example[[i,"event_formula"]] <- x
}
Upvotes: 1