amatof

Reputation: 185

Generating a column of random times

In my dataset, I want to include a column that contains only times. I have generated a column of random dates ranging from 2018 to 2020, but the time stamps don't appear to be generated as randomly throughout the day as I would like.

This is how I made the date/time column.

data$date <- sample(seq(as.POSIXct('2018/01/01'), as.POSIXct('2020/12/31'), by = "day"),
                    length(data$date), replace = TRUE)

I then use that column to get the times:

data$time <- format(data$date, format = "%H:%M:%S")

But this is what the result looks like:

> dput(data[1:10,-c(5,6)])
structure(list(order_num = c(501073L, 969942L, 1091101L, 590143L, 
390404L, 219429L, 1025827L, 689629L, 694348L, 435848L), date = structure(c(1542344400, 
1552194000, 1550379600, 1534568400, 1523336400, 1563426000, 1595826000, 
1552712400, 1534309200, 1547960400), class = c("POSIXct", "POSIXt"
), tzone = ""), total_sale = c(36.3853391310075, 35.9405038506853, 
55.6254974332793, 47.7214780063544, 61.4086594373677, 32.8631076291332, 
33.3640439679803, 40.8944394660076, 54.9455495252506, 48.12597580998
), season = c("Spring", "Winter", "Winter", "Fall", "Fall", "Spring", 
"Summer", "Summer", "Fall", "Fall"), time = c("00:00:00", "00:00:00", 
"00:00:00", "01:00:00", "01:00:00", "01:00:00", "01:00:00", "01:00:00", 
"01:00:00", "00:00:00")), row.names = c(NA, 10L), class = "data.frame")

I am hoping for more random times throughout the day, such as 9:33:35, 14:56:43, and so on.

Upvotes: 1

Views: 170

Answers (2)

Ronak Shah

Reputation: 388972

You can generate random times using:

data$time <- format(as.POSIXct(sample(86400, nrow(data)), origin = '1970-01-01'), '%T')

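This draws random integers from 1 to 86400 (the number of seconds in a day), converts them to POSIXct, and extracts only the time part with format(). Note that sample() draws without replacement by default, so it will fail if data has more than 86400 rows unless you add replace = TRUE.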
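As a self-contained sketch of the same idea, the variant below pins the time zone to UTC so that the drawn second of the day maps directly onto the printed time regardless of the local time zone. The small data frame and the variable name secs are stand-ins for illustration, not part of the original answer, and the seed is only there to make the sketch reproducible.

```r
# Hypothetical stand-in for the asker's data frame
set.seed(42)  # only to make the sketch reproducible
data <- data.frame(order_num = 1:10)

# Draw one second-of-day per row; 0:86399 covers 00:00:00 through 23:59:59,
# and replace = TRUE lets the draw work even with more than 86400 rows
secs <- sample(0:86399, nrow(data), replace = TRUE)

# Using tz = "UTC" in both calls keeps the local time zone from shifting the result
data$time <- format(as.POSIXct(secs, origin = "1970-01-01", tz = "UTC"),
                    "%H:%M:%S", tz = "UTC")
```

Here data$time is a character column, just like the one in the question, so it can only be used for display or grouping, not for time arithmetic.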

Upvotes: 4

neuron

Reputation: 2059

I think this function will help you generate random times throughout the day, as you mentioned:

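randomtimes <- function(N, st = "2018/01/01", et = "2020/12/31") {
  # Convert the start and end dates to POSIXct
  st <- as.POSIXct(as.Date(st))
  et <- as.POSIXct(as.Date(et))
  # Length of the interval in seconds
  dt <- as.numeric(difftime(et, st, units = "secs"))
  # N uniform offsets in seconds, sorted so the result is chronological
  ev <- sort(runif(N, 0, dt))
  # Return the date-times visibly rather than ending on an assignment
  st + ev
}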

Then you can apply this to your data. Here nrow(data) counts the rows in your data, and that count is used as the number of date-times to generate. If your data has only the 10 rows shown in your sample, you could also replace nrow(data) with 10.

data$date <- randomtimes(nrow(data))
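Because this approach produces full date-times, the time-only column from the question can then be derived with format(). The sketch below is self-contained and makes a few assumptions that are not in the original answer: a small stand-in data frame, UTC as the time zone, and no sorting of the offsets, so the rows are not in chronological order.

```r
# Hypothetical stand-in for the asker's data frame
set.seed(1)  # only to make the sketch reproducible
data <- data.frame(order_num = 1:10)

# Same idea as the function above: uniform offsets in seconds from the start
st <- as.POSIXct("2018-01-01", tz = "UTC")
et <- as.POSIXct("2020-12-31", tz = "UTC")
span <- as.numeric(difftime(et, st, units = "secs"))
data$date <- st + runif(nrow(data), 0, span)

# Derive the time-only column from the random date-times
data$time <- format(data$date, "%H:%M:%S")
```

Since the offsets are continuous rather than whole days, the times are spread across the whole day instead of landing on midnight.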

Upvotes: 1
