Harshit Nagar

Reputation: 438

Read csv file using read.csv() without losing milliseconds

I have a csv file with a timestamp column. The timestamps are in the format %Y-%m-%d %H:%M:%OS4, that is, the seconds carry a 4-digit fractional part. When I read this csv using read.csv(), I get the timestamps as character values only down to the second, without the milliseconds. How can I read the milliseconds as well?
Edit to add required data and code:
mtc_data = read.csv("path/to/csv")
Notepad.pw link to data

Upvotes: 4

Views: 1004

Answers (2)

phhu

Reputation: 1952

If you specify "POSIXct" as the colClass for the datetime column when calling read.csv you preserve the time information, including milliseconds, as shown below.

# my_options <- options(digits.secs = 4) 
df <- read.csv(
  "data.csv"
  ,colClasses = c("POSIXct","factor")
  ,na.strings = c("")
)

print(
  format(df$timestamp[2], '%Y-%m-%d %H:%M:%OS4') 
) # "2018-11-20 00:00:05.0583"

Setting options(digits.secs = 4) makes R print fractional seconds to four digits, but it is not needed to preserve the information (in this case at least). Note that the value above prints as .0583 although the file contains .0584; this is most likely because the fractional seconds are stored as a floating-point number and are truncated, not rounded, when formatted. Specifying na.strings can also be useful for handling missing values. One annoyance: by default POSIXct doesn't seem to handle ISO standard dates that use "T" to separate the date and time. It drops the time information when it finds one, so if you have such values you may need to replace the "T" with a space first.
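As a rough sketch of the "T" workaround mentioned above: the timestamps here are made-up examples (they are not in the question's data), and the tz = "UTC" argument is an assumption added only to make the result reproducible.

```r
# Hypothetical ISO 8601 style strings with a "T" separator (illustrative only)
iso <- c("2018-11-20T00:00:05.0584", "2018-11-20T00:00:07.5407")

# Replace the literal "T" with a space so the usual date-time format applies
clean <- sub("T", " ", iso, fixed = TRUE)

# %OS reads seconds including the fractional part; tz is an assumption
x <- as.POSIXct(clean, format = "%Y-%m-%d %H:%M:%OS", tz = "UTC")

# Display with four fractional digits (the last digit may be truncated)
format(x, "%Y-%m-%d %H:%M:%OS4")
```

Another option that should work is to keep the strings as they are and put the "T" literally into the format string, e.g. format = "%Y-%m-%dT%H:%M:%OS".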

See the read.csv docs here.

For reference, the CSV file is:

"timestamp","execution"
2018-11-20 00:00:00.0000,"STOPPED"
2018-11-20 00:00:05.0584,"STOPPED"
2018-11-20 00:00:07.5407,"RUNNING"

Upvotes: 0

jay.sf

Reputation: 72758

After reading the file with read.csv (where you may want to set stringsAsFactors=FALSE), convert the column with as.POSIXct using the format string you already have. The milliseconds are stored internally even though they are not printed by default. strftime will display them, but its result is a "character" vector, no longer "POSIXct". It may be safer to apply trimws first to strip stray whitespace after reading in.

dat <- read.csv("V:/R/_data/yourData.csv", stringsAsFactors=FALSE)
(x <- as.POSIXct(trimws(dat$timestamp), format="%Y-%m-%d %H:%M:%OS"))
# [1] "2018-11-20 00:00:00 CET" "2018-11-20 00:00:05 CET" "2018-11-20 00:00:07 CET"

x2 <- strftime(x, format="%Y-%m-%d %H:%M:%OS6")
x2
# [1] "2018-11-20 00:00:00.000000" "2018-11-20 00:00:05.058399" "2018-11-20 00:00:07.540699"
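If the goal is only to see the fractional seconds while keeping the column as "POSIXct", a minimal sketch is to raise digits.secs temporarily instead of converting to character. The single timestamp and tz = "UTC" below are assumptions for illustration, not part of the original data set-up.

```r
# Temporarily print up to 4 fractional digits; keep the old setting to restore it
old <- options(digits.secs = 4)

# tz = "UTC" is an assumption to make the example reproducible
x <- as.POSIXct("2018-11-20 00:00:05.0584",
                format = "%Y-%m-%d %H:%M:%OS", tz = "UTC")

print(x)   # fractional seconds are shown; the last digit may be truncated
class(x)   # still "POSIXct" "POSIXt", unlike the strftime() result

# Restore the previous option value
options(old)
```

Because the display tends to truncate rather than round, a trailing digit such as ...399 in the output above does not mean the data was read incorrectly.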

Upvotes: 1
