sk17
sk17

Reputation: 41

How do I assign a value in R if within a certain range of time?

I have a large data set that collects multiple data points each day from people over multiple days. My R dataset has participants' responses and the timestamp for their response. I want to recode the timestamp to reflect which order prompt they responded to. So basically, I want to assign a value to the timestamp based on a range of time. So if on Monday, a response falls between 10:00 and 10:30, I want the value to be 1. If a response falls between 12:15 and 12:45, I want the value to be 2. If a response falls between 2:20 and 2:50, I want the value to be 3.

BUT I need that code to work only for Monday's data. For Tuesday's data, the timestamp ranges changes. For example, if a Tuesday response falls between 9:10 and 9:40, that value should be 1. And so on.

I can't for the life of me how to figure this out with an if else statement. When I write time into R, it thinks I'm writing a code for a series of values (10 through 30) rather than time (10:30).

Example of what I have: enter image description here

Example of what I want: (see the new Prompt column) So for 10/11/15 I want Prompt 1 to fall between 11:15:00 and 11:45:00, but for 11/11/15 I want Prompt 1 to be different--between 12:00:00 and 12:30:00 enter image description here

Upvotes: 2

Views: 4131

Answers (3)

sk17
sk17

Reputation: 41

I ended up using some of both of those answers.

library(lubridate)

#change data to POSIXct class
data$StartDate <- dmy(as.character(data$StartDate))
data$EndDate <- dmy(as.character(data$EndDate))

data$StartTime2 <- hms(as.character(data$StartTime))
data$EndTime2 <- hms(as.character(data$Endataime))

I didn't have to do both, but I did anyway. I created an additional variable because changing it makes it look funny.

#check me out
class(data$StartDate)
#[1] "POSIXct" "POSIXt" 
 class(data$StartTime2)
#[1] "Period"
#attr(,"package")
#[1] "lubridate"

Based off the second comment I then did:

data$day = wday(data$StartDate, label=T)
data$hour = hour(data$StartTime2)
data$minute = minute(data$StartTime2)

# create a column with an arbitrary value to start with and also double     check in the end
data$prompt = -1

# conditions for Tuesday (10/11/2015) 
data$prompt[data$day=="Tues" & data$hour==11 & data$minute >= 10 & data$minute <=40] = 1
data$prompt[data$day=="Tues" & data$hour==13 & data$minute >= 35 & data$minute <=59] = 2
data$prompt[data$day=="Tues" & data$hour==16 & data$minute >= 15 & data$minute <=45] = 3

And so on. I know I have to fix the prompt 2 for this day because it goes into hour 14, but that's to play with next. Thanks for your help!

Upvotes: 0

AntoniosK
AntoniosK

Reputation: 16121

I have a simpler solution using days, hours and minutes and your (manual) filters which you can use as a function. Check my simple example:

 library(lubridate)

   # example dataset
   dt = data.frame(responce = 1:3,
                   date = c("2015-08-10 10:15:34","2015-08-10 12:29:14","2015-08-11 09:12:18"),
                      stringsAsFactors = F)

     dt

#   responce                date
#   1        1 2015-08-10 10:15:34
#   2        2 2015-08-10 12:29:14
#   3        3 2015-08-11 09:12:18


     # transform to date and obtain day, hour and minutes
   dt$date = ymd_hms(dt$date)
   dt$day = wday(dt$date, label=T)
   dt$hour = hour(dt$date)
   dt$minute = minute(dt$date)

     dt

#   responce                date  day hour minute
#   1        1 2015-08-10 10:15:34  Mon   10     15
#   2        2 2015-08-10 12:29:14  Mon   12     29
#   3        3 2015-08-11 09:12:18 Tues    9     12


     # create a column with an arbitrary value to start with and also double check in the end
   dt$value = -1

     # conditions for Monday
   dt$value[dt$day=="Mon" & dt$hour==10 & dt$minute >= 0 & dt$minute <=30] = 1
   dt$value[dt$day=="Mon" & dt$hour==12 & dt$minute >= 15 & dt$minute <=45] = 2
   dt$value[dt$day=="Mon" & dt$hour==14 & dt$minute >= 20 & dt$minute <=50] = 3

     # conditions for Tuesday
   dt$value[dt$day=="Tues" & dt$hour==9 & dt$minute >= 10 & dt$minute <=40] = 1

     dt

#   responce                date  day hour minute value
#   1        1 2015-08-10 10:15:34  Mon   10     15     1
#   2        2 2015-08-10 12:29:14  Mon   12     29     2
#   3        3 2015-08-11 09:12:18 Tues    9     12     1

     # double check all your rows matched (you have no -1 values)
   dt[dt$value == -1]

  # data frame with 0 columns and 3 rows

Upvotes: 1

mra68
mra68

Reputation: 2960

If you want to work with times and dates, the POSIXlt class is helpful. If your timestamps are stored as strings, the first step is to convert them into POSIXlt. You can use "strptime" for this, e.g.

> t <- strptime("2015-01-01 12:18",format="%Y-%m-%d %H:%M")
> t
[1] "2015-01-01 12:18:00 CET"
> class(t)
[1] "POSIXlt" "POSIXt" 
>

The following function "timerange" assigns a time range number to such a POSIXlt object:

R <- list( Sun = list(),
           Mon = list( c("10:00","10:30"), c("12:15","12:40"), c("13:15","13:40") ),                      
           Tue = list( c( "9:10", "9:40"), c("11:00","11:30"), c("13:15","13:40") ),
           Wed = list( c("10:00","10:30"), c("12:15","12:40"), c("13:15","13:40") ),                      
           Thu = list( c("10:00","10:30"), c("12:15","12:40"), c("13:15","13:40") ),                      
           Fri = list( c("10:00","10:30"), c("12:15","12:40"), c("13:15","13:40") ),                      
           Sat = list( c("10:00","10:30"), c("12:15","12:40"), c("13:15","13:40") )  )                      

timerange <- function(t)
{
  s <- unlist(strsplit(strftime(t,format="%Y-%m-%d %H:%M:%S %w")," "))  
  w <- as.numeric(s[3]) + 1  
  n <- sapply(R[[w]], function(x){ strptime(paste(s[1]," ",x,":00",sep=""),
                                            format="%Y-%m-%d %H:%M:%S")})  

  return( which(sapply(n,function(x){ t-x[1]>=0 & t-x[2]<=0})) )
}

"R" is the list of all time ranges. You can change it as you like. "strftime" is the counterpart to "strptime", i.e. it converts the POSIXlt object "t" into a string of a desired format. This string is then spitted into the date part, the time part, and the day of the week. The latter is used to pick the appropriate sublist in "R". Then "strptime" is used to create a list of pairs of POSIXlt objects. The time part comes from the appropriate sublist of "R", and the date part comes from "t". Each such pair represents a time interval. Then the time range number is the index of the time interval which contains "t".

Some examples:

> t <- strptime("2015-01-01 12:18",format="%Y-%m-%d %H:%M")
> timerange(t)
[1] 2
> t <- strptime("2015-01-05 10:01",format="%Y-%m-%d %H:%M")
> timerange(t)
[1] 1
> t <- strptime("05.01.2015 13:25",format="%d.%m.%Y %H:%M")
> timerange(t)
[1] 3

Upvotes: 2

Related Questions