Reputation: 2283
I have a CSV file that contain the following:
ts1<-read.table(header = TRUE, sep=",", text="
start, end, value
1,26/11/2014 13:00,26/11/2014 20:00,decreasing
2,26/11/2014 20:00,27/11/2014 09:00,increasing ")
I would like to transfer the above dataframe
to a dataframe
in which each row time
column is opened and filled in with the value. The time gap is filled in from the start
time to the end
time - 1 (minus 1), as followed:
date hour value
1 26/11/2014 13:00 decreasing
2 26/11/2014 14:00 decreasing
3 26/11/2014 15:00 decreasing
4 26/11/2014 16:00 decreasing
5 26/11/2014 17:00 decreasing
6 26/11/2014 18:00 decreasing
7 26/11/2014 19:00 decreasing
8 26/11/2014 20:00 increasing
9 26/11/2014 21:00 increasing
10 26/11/2014 22:00 increasing
11 26/11/2014 23:00 increasing
12 26/11/2014 00:00 increasing
13 26/11/2014 01:00 increasing
14 26/11/2014 02:00 increasing
15 26/11/2014 03:00 increasing
16 26/11/2014 04:00 increasing
17 26/11/2014 05:00 increasing
18 26/11/2014 06:00 increasing
19 26/11/2014 07:00 increasing
20 26/11/2014 08:00 increasing
I tried to start with separating the hours from the dates:
> t <- strftime(ts1$end, format="%H:%M:%S")
> t
[1] "00:00:00" "00:00:00"
Upvotes: 1
Views: 35
Reputation: 9923
Here is a solution using lubridate and plyr. It processes each row of the data to make a sequence from the start to the end, and returns this with the value. Results from each row are combined into one data.frame. If you need to process the results further, you might be better off not separating the datetime into date and time
library(plyr)
library(lubridate)
ts1$start <- dmy_hm(ts1$start)
ts1$end <- dmy_hm(ts1$end)
adply(.data = ts1, .margin = 1, .fun = function(x){
datetime <- seq(x$start, x$end, by = "hour")
#data.frame(datetime, value = x$value)"
data.frame(date = as.Date(datetime), time = format(datetime, "%H:%M"), value = x$value)
})[, -(1:2)]
Upvotes: 1
Reputation: 887621
We can use data.table
. Convert the 'data.frame' to 'data.table' (setDT(ts1)
), grouped by the sequence of rows (1:nrow(ts1)
), we convert the 'start' and 'end' columns to datetime class (using dmy_hm
from lubridate
), get the sequence by
'1 hour', format
the result to expected format, then split by space (tstrsplit
), concatenate with the 'value' column, remove the 'rn' column by assigning to NULL
. Finally, we can change the column names (if needed).
library(lubridate)
library(data.table)
res <- setDT(ts1)[,{st <- dmy_hm(start)
et <- dmy_hm(end)
c(tstrsplit(format(head(seq(st, et, by = "1 hour"),-1),
"%d/%m/%Y %H:%M"), "\\s+"), as.character(value))} ,
by = .(rn=1:nrow(ts1))
][, rn := NULL][]
setnames(res, c("date", "hour", "value"))[]
# date hour value
# 1: 26/11/2014 13:00 decreasing
# 2: 26/11/2014 14:00 decreasing
# 3: 26/11/2014 15:00 decreasing
# 4: 26/11/2014 16:00 decreasing
# 5: 26/11/2014 17:00 decreasing
# 6: 26/11/2014 18:00 decreasing
# 7: 26/11/2014 19:00 decreasing
# 8: 26/11/2014 20:00 increasing
# 9: 26/11/2014 21:00 increasing
#10: 26/11/2014 22:00 increasing
#11: 26/11/2014 23:00 increasing
#12: 27/11/2014 00:00 increasing
#13: 27/11/2014 01:00 increasing
#14: 27/11/2014 02:00 increasing
#15: 27/11/2014 03:00 increasing
#16: 27/11/2014 04:00 increasing
#17: 27/11/2014 05:00 increasing
#18: 27/11/2014 06:00 increasing
#19: 27/11/2014 07:00 increasing
#20: 27/11/2014 08:00 increasing
Upvotes: 1