Reputation: 2471
I want to read a .csv file with columns for date and time into a data frame. This is my raw data:
10/05/18,0:00:03,SP-PALL-01,0.05
10/05/18,0:00:03,SP-PDALL-05,0.1
10/05/18,0:00:03,PT33-PT34,0.21249676
10/05/18,0:00:03,PT32-PT36,0.42838383
10/05/18,0:00:03,SP-PDAH-03,0.6
10/05/18,0:00:03,PT32-PT34,0.60584164
10/05/18,0:00:03,SP-PDAH-04,0.7
10/05/18,0:00:03,PT-31,2.4700246
10/05/18,0:00:03,PT32-PT31,12.394566
10/05/18,0:00:03,PT-32,14.782079
I used a variation of this answer:
setClass('tagDate')
setClass('tagTime')
setAs("character","tagDate", function(from) as.Date(from, format="%d/%m/%y"))
setAs("character","tagTime", function(from) as.POSIXct(from, format="%H:%M:%S"))
df <- read.csv('allTags-cleanup-copy.csv',header = FALSE, colClasses=c('tagDate','tagTime','character','real'))
names(df) <- c('tagDate', 'tagTime', 'tagName', 'tagValue')
head(df)
This is my result:
tagDate tagTime tagName tagValue
1 2018-05-10 2018-05-10 00:00:03 SP-PALL-01 0.0500000
2 2018-05-10 2018-05-10 00:00:03 SP-PDALL-05 0.1000000
3 2018-05-10 2018-05-10 00:00:03 PT33-PT34 0.2124968
4 2018-05-10 2018-05-10 00:00:03 PT32-PT36 0.4283838
5 2018-05-10 2018-05-10 00:00:03 SP-PDAH-03 0.6000000
6 2018-05-10 2018-05-10 00:00:03 PT32-PT34 0.6058416
For the time column, I tried variations of separator and format strings, but no luck.
Can you please help me parse this data set successfully?
Upvotes: 1
Views: 78
Reputation: 20095
Probably using hms
package is an option to convert time
column from character
and store as difftime
.
library(hms)
setClass('tagDate')
setClass('tagTime')
setAs("character","tagDate", function(from) as.Date(from, format="%d/%m/%y"))
setAs("character","tagTime", function(from) as.hms(from)) #Changing to difftime
df <- read.csv(textConnection(text),header = FALSE,
colClasses=c('tagDate','tagTime','character','real'))
names(df) <- c('tagDate', 'tagTime', 'tagName', 'tagValue')
head(df)
# tagDate tagTime tagName tagValue
# 1 2018-05-10 00:00:03 SP-PALL-01 0.0500000
# 2 2018-05-10 00:00:03 SP-PDALL-05 0.1000000
# 3 2018-05-10 00:00:03 PT33-PT34 0.2124968
# 4 2018-05-10 00:00:03 PT32-PT36 0.4283838
# 5 2018-05-10 00:00:03 SP-PDAH-03 0.6000000
# 6 2018-05-10 00:00:03 PT32-PT34 0.6058416
# 7 2018-05-10 00:00:03 SP-PDAH-04 0.7000000
# 8 2018-05-10 00:00:03 PT-31 2.4700246
# 9 2018-05-10 00:00:03 PT32-PT31 12.3945660
# 10 2018-05-10 00:00:03 PT-32 14.7820790
Data:
text <- "10/05/18,0:00:03,SP-PALL-01,0.05
10/05/18,0:00:03,SP-PDALL-05,0.1
10/05/18,0:00:03,PT33-PT34,0.21249676
10/05/18,0:00:03,PT32-PT36,0.42838383
10/05/18,0:00:03,SP-PDAH-03,0.6
10/05/18,0:00:03,PT32-PT34,0.60584164
10/05/18,0:00:03,SP-PDAH-04,0.7
10/05/18,0:00:03,PT-31,2.4700246
10/05/18,0:00:03,PT32-PT31,12.394566
10/05/18,0:00:03,PT-32,14.782079"
Upvotes: 1
Reputation: 4328
The lubridate
library relieves you of any need to do setClass
in advance. This is a shortcut: first read the df
with colClasses
as all characters. Then
library(tidyverse) ## For %>% pipes
library(magrittr) ## For %<>% pipes (convenience)
library(lubridate) ## For parsing dates
df %<>%
dplyr::mutate(
tagDate = as.Date(parse_date_time(tagDate, 'mdy')),
tagTime = format(parse_date_time(tagTime, 'HMS'),
format = '%H:%M:%S')
)
Basically, you just change the format of the display for tagTime
.
Upvotes: 1