Avi
Avi

Reputation: 2283

Dataframe datetime value row filling

I have a CSV file that contain the following:

ts1<-read.table(header = TRUE, sep=",", text="
   start,            end,            value
1,26/11/2014 13:00,26/11/2014 20:00,decreasing
2,26/11/2014 20:00,27/11/2014 09:00,increasing ")

I would like to transfer the above dataframe to a dataframe in which each row time column is opened and filled in with the value. The time gap is filled in from the start time to the end time - 1 (minus 1), as followed:

     date       hour        value
1   26/11/2014  13:00   decreasing
2   26/11/2014  14:00   decreasing
3   26/11/2014  15:00   decreasing
4   26/11/2014  16:00   decreasing
5   26/11/2014  17:00   decreasing
6   26/11/2014  18:00   decreasing
7   26/11/2014  19:00   decreasing
8   26/11/2014  20:00   increasing
9   26/11/2014  21:00   increasing
10  26/11/2014  22:00   increasing
11  26/11/2014  23:00   increasing
12  26/11/2014  00:00   increasing
13  26/11/2014  01:00   increasing
14  26/11/2014  02:00   increasing
15  26/11/2014  03:00   increasing
16  26/11/2014  04:00   increasing
17  26/11/2014  05:00   increasing
18  26/11/2014  06:00   increasing
19  26/11/2014  07:00   increasing
20  26/11/2014  08:00   increasing

I tried to start with separating the hours from the dates:

> t <- strftime(ts1$end, format="%H:%M:%S")
> t
[1] "00:00:00" "00:00:00"

Upvotes: 1

Views: 35

Answers (2)

Richard Telford
Richard Telford

Reputation: 9923

Here is a solution using lubridate and plyr. It processes each row of the data to make a sequence from the start to the end, and returns this with the value. Results from each row are combined into one data.frame. If you need to process the results further, you might be better off not separating the datetime into date and time

library(plyr)
library(lubridate)
ts1$start <- dmy_hm(ts1$start)
ts1$end <- dmy_hm(ts1$end)

adply(.data = ts1, .margin  = 1, .fun = function(x){
  datetime <- seq(x$start, x$end, by = "hour")
  #data.frame(datetime, value = x$value)"
  data.frame(date = as.Date(datetime), time = format(datetime, "%H:%M"), value = x$value)
})[, -(1:2)]

Upvotes: 1

akrun
akrun

Reputation: 887621

We can use data.table. Convert the 'data.frame' to 'data.table' (setDT(ts1)), grouped by the sequence of rows (1:nrow(ts1)), we convert the 'start' and 'end' columns to datetime class (using dmy_hm from lubridate), get the sequence by '1 hour', format the result to expected format, then split by space (tstrsplit), concatenate with the 'value' column, remove the 'rn' column by assigning to NULL. Finally, we can change the column names (if needed).

library(lubridate)
library(data.table)
res <- setDT(ts1)[,{st <- dmy_hm(start)
                    et <- dmy_hm(end)
                    c(tstrsplit(format(head(seq(st, et, by = "1 hour"),-1),
                          "%d/%m/%Y %H:%M"), "\\s+"), as.character(value))} ,
        by =  .(rn=1:nrow(ts1))
   ][, rn := NULL][]
setnames(res, c("date", "hour", "value"))[]
#          date  hour       value
# 1: 26/11/2014 13:00  decreasing
# 2: 26/11/2014 14:00  decreasing
# 3: 26/11/2014 15:00  decreasing
# 4: 26/11/2014 16:00  decreasing
# 5: 26/11/2014 17:00  decreasing
# 6: 26/11/2014 18:00  decreasing
# 7: 26/11/2014 19:00  decreasing
# 8: 26/11/2014 20:00 increasing 
# 9: 26/11/2014 21:00 increasing 
#10: 26/11/2014 22:00 increasing 
#11: 26/11/2014 23:00 increasing 
#12: 27/11/2014 00:00 increasing 
#13: 27/11/2014 01:00 increasing 
#14: 27/11/2014 02:00 increasing 
#15: 27/11/2014 03:00 increasing 
#16: 27/11/2014 04:00 increasing 
#17: 27/11/2014 05:00 increasing 
#18: 27/11/2014 06:00 increasing 
#19: 27/11/2014 07:00 increasing 
#20: 27/11/2014 08:00 increasing 

Upvotes: 1

Related Questions