MSNavin

Reputation: 13

Convert hour to date-time

I have a data frame with an hour stamp and the corresponding measured temperature. The measurements are taken continuously, at random intervals. I would like to convert the hours to the respective date-times, keeping the measured temperature alongside. My data frame looks like this (the measurements started on 20/05/2016):

Time, Temp
09.25,28
10.35,28.2
18.25,29
23.50,30
01.10,31
12.00,36
02.00,25

I would like to create a data.frame with the corresponding date-time and Temp, like this:

Time, Temp
2016-05-20 09:25,28
2016-05-20 10:35,28.2
2016-05-20 18:25,29
2016-05-20 23:50,30
2016-05-21 01:10,31
2016-05-21 12:00,36
2016-05-22 02:00,25

I would be thankful for any comments or tips on R packages or functions that I could look into for this. Thanks for your time.

Upvotes: 1

Views: 1825

Answers (2)

Jaap

Reputation: 83275

A possible solution in base R:

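# build a date-time from the start date and the zero-padded hour.minute value
df$Time <- as.POSIXct(strptime(paste('2016-05-20', sprintf('%05.2f',df$Time)), format = '%Y-%m-%d %H.%M', tz = 'GMT'))
# add one day for every drop in the time, i.e. for every midnight crossed
df$Time <- df$Time + cumsum(c(0,diff(df$Time)) < 0) * 86400  # 86400 = 60 * 60 * 24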

which gives:

> df
                 Time Temp
1 2016-05-20 09:25:00 28.0
2 2016-05-20 10:35:00 28.2
3 2016-05-20 18:25:00 29.0
4 2016-05-20 23:50:00 30.0
5 2016-05-21 01:10:00 31.0
6 2016-05-21 12:00:00 36.0
7 2016-05-22 02:00:00 25.0
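
To see why this works, it may help to look at the intermediate values. The following is only a sketch on the question's sample hours; the variable names (hours, rolled_over, day_offset, stamp, result) are mine and not part of the solution above:

```r
# Hours from the question, as numeric values (HH.MM encoded as a decimal)
hours <- c(9.25, 10.35, 18.25, 23.50, 1.10, 12.00, 2.00)

# TRUE wherever the clock value drops relative to the previous row,
# i.e. wherever midnight was crossed between two measurements
rolled_over <- c(0, diff(hours)) < 0

# Running count of midnights crossed = number of days to add to each row
day_offset <- cumsum(rolled_over)
print(day_offset)  # 0 0 0 0 1 1 2

# Parse every row against the start date, then shift it by its day offset
stamp <- as.POSIXct(paste("2016-05-20", sprintf("%05.2f", hours)),
                    format = "%Y-%m-%d %H.%M", tz = "GMT")
result <- stamp + day_offset * 86400
print(format(result, "%Y-%m-%d %H:%M", tz = "GMT"))
```

The key point is that cumsum keeps the count for all following rows, so every measurement after a midnight crossing stays on the later day.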

An alternative with data.table (of course, you can also use cumsum with diff instead of shift):

library(data.table)
setDT(df)[, Time := as.POSIXct(strptime(paste('2016-05-20', sprintf('%05.2f',Time)), format = '%Y-%m-%d %H.%M', tz = 'GMT')) +
            cumsum(Time < shift(Time, fill = Time[1])) * 86400]

Or with dplyr:

library(dplyr)
df %>%
  mutate(Time = as.POSIXct(strptime(paste('2016-05-20', 
                                          sprintf('%05.2f',Time)), 
                                    format = '%Y-%m-%d %H.%M', tz = 'GMT')) + 
           cumsum(c(0,diff(Time)) < 0)*86400)

which will both give the same result.


Used data:

df <- read.table(text='Time, Temp
09.25,28
10.35,28.2
18.25,29
23.50,30
01.10,31
12.00,36
02.00,25', header=TRUE, sep=',')

Upvotes: 1

teadotjay

Reputation: 1455

You can use a custom date format combined with some code that detects when a new day begins (this assumes that the first measurement of each day is taken at an earlier clock time than the last measurement of the previous day).

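# starting day
start_date = "2016-05-20"

# read Time as character so the zero-padded "HH.MM" strings are preserved
values = read.csv('values.txt', colClasses = c("character", NA))
# previous row's time; the first row is compared with itself, so it never counts as a new day
last = c(values$Time[1], head(values$Time, -1))
# number of midnights crossed so far (zero-padded strings compare in clock order)
day = cumsum(values$Time < last)

Time = strptime(paste(start_date, values$Time), "%Y-%m-%d %H.%M")
Time = Time + day*86400  # 86400 seconds per day
values$Time = Time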
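
The assumption mentioned at the top matters in practice. As a small illustration with made-up readings (not the question's data; the names hours, previous and day are hypothetical), a reading at 09.00 on one day followed by the next reading at 10.00 on the following day shows no drop in clock time, so the day change goes undetected:

```r
# Hypothetical readings: 09.00 on day 1, then 10.00 on day 2, nothing in between
hours <- c("09.00", "10.00")

# Same detection logic as above: compare each time with the previous one
previous <- c(hours[1], head(hours, -1))
day <- cumsum(hours < previous)

print(day)  # both offsets are 0, so the second reading is wrongly kept on day 1
```

If gaps like this can occur in the data, the hours alone are not enough and the date has to come from another source.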

Upvotes: 1
