user14030594

Editing date and time in an R data frame

I'm working with a .csv file of water height and date. The date column comes in the format "2007-03-15T18:54:00Z". I've tried using regex to remove the 'T' and the 'Z' so I can manipulate the time for visualization, but I keep getting NA for all of my entries.

df <- fread("./IrishNationalTideGalway.csv",select = c("time (UTC)","Water_Level_LAT (metres)"))

data <- df[c(918121:994130)] #2008-2009 subset of data

colnames(data)[1] <- "time"
colnames(data)[2] <- "height"

data$time <- as.POSIXct( data$time , format = "%Y/%m/%d %I:%M:%S" , tz = "GMT")

I'm unsure how to get rid of the T and Z and then also how to put it into a format that I can manipulate.

Upvotes: 0

Views: 137

Answers (2)

akrun

Reputation: 887138

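We could parse the string into a date-time with lubridate::ymd_hms(), which handles the 'T' and 'Z', and then apply as.Date() if only the date is needed (DATE_1 stands for the column holding the date strings):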

library(dplyr)
df %>% 
    mutate(DATE_2 = as.Date(lubridate::ymd_hms(DATE_1)))
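
Since the question mentions manipulating the time, here is a minimal sketch that keeps the time of day instead of reducing to a Date. It assumes the column is named `time` as in the question; the two rows and the `height` values are made up for illustration and are not taken from the real file.

```r
library(lubridate)

# Made-up sample rows in the format described in the question
data <- data.frame(
  time   = c("2007-03-15T18:54:00Z", "2008-03-15T18:54:00Z"),
  height = c(1.2, 3.4)  # hypothetical values
)

# ymd_hms() accepts the ISO 8601 "T" separator and the trailing "Z"
data$time <- ymd_hms(data$time, tz = "UTC")

class(data$time)    # POSIXct date-time, usable on a plot axis
hour(data$time)     # extract the hour component
as.Date(data$time)  # drop the time only where a plain date is wanted
```

With the column stored as POSIXct, no regex cleanup of the 'T' and 'Z' should be needed.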

Upvotes: 2

Mike V

Reputation: 1364

Here is a simple example that solves your problem:

library(dplyr)     # mutate(), %>%
library(stringi)   # stri_replace_all()

df <- data.frame(OBS = 1:2,DATE_1 = c("2007-03-15T18:54:00Z", "2008-03-15T18:54:00Z"))
 
# Strip the time part (after "T" or a space); as.Date() ignores the trailing "Z"
df2 <- df %>% 
  mutate(DATE_2 = as.Date(stri_replace_all(DATE_1, regex = "T+(?:[01]\\d|2[0-3]):(?:[0-5]\\d):(?:[0-5]\\d)| (?:[01]\\d|2[0-3]):(?:[0-5]\\d):(?:[0-5]\\d)", "")))
df2
# OBS               DATE_1     DATE_2
# 1   1 2007-03-15T18:54:00Z 2007-03-15
# 2   2 2008-03-15T18:54:00Z 2008-03-15

Or, if you only want to remove the 'T' and 'Z', try this:

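library(stringr)   # str_replace_all(), regex(), str_trim()

df3 <- df %>% 
  mutate(DATE_2 = str_replace_all(DATE_1, regex("T|Z"), " ")) %>%   # T and Z become spaces
  mutate(DATE_2 = str_trim(DATE_2, side = "right"))                 # drop the trailing space
df3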
# OBS               DATE_1              DATE_2
# 1   1 2007-03-15T18:54:00Z 2007-03-15 18:54:00
# 2   2 2008-03-15T18:54:00Z 2008-03-15 18:54:00
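
As a further option, base R can parse the original strings directly if the format string matches the input, so removing the 'T' and 'Z' is not strictly required. The sketch below uses the two sample strings from above (the vector name `x` is just for illustration) and also shows why the call in the question gives NA: its format uses "/" separators and the 12-hour "%I", while the data uses "-" and 24-hour times.

```r
x <- c("2007-03-15T18:54:00Z", "2008-03-15T18:54:00Z")

# Literal "T" and "Z" in the format string match those characters in the input
parsed <- as.POSIXct(x, format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")

format(parsed, "%Y-%m-%d %H:%M:%S")
# [1] "2007-03-15 18:54:00" "2008-03-15 18:54:00"

# The format from the question does not match the input, so every value is NA
as.POSIXct(x, format = "%Y/%m/%d %I:%M:%S", tz = "GMT")
# [1] NA NA
```

The result is a POSIXct vector, so it can be assigned back to the time column and used for arithmetic or plotting.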

Upvotes: 0
