jpinelo

Reputation: 1454

Read CSV with timestamps into R: defining colClasses in read.table

I'm trying to read a table (a .csv file, about 120K rows x 21 columns), assigning classes to the columns with:

read.table(file = "G1to21jan2015.csv", 
           header = TRUE, 
           colClasses = c (rep("POSICXct", 6), 
                           rep("numeric", 2), 
                           rep("POSICXct", 2),  
                           "numeric", 
                           NULL, 
                           "numeric", 
                           NULL, 
                           rep("character", 2), 
                           rep("numeric", 5))
)

I get the following error:

Error in read.table(file = "G1to21jan2015.csv", header = TRUE, colClasses = c(rep("POSICXct",  : 
  more columns than column names

I've confirmed that the csv has 21 columns and so (I believe) does my request.

Removing the second argument, header = TRUE, gives a different error:

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  : 
  line 1 did not have 19 elements

Note: I'm using POSICXct to read data in the format 1/5/2015 15:00:00 (m/d/Y H:M:S), numeric to read data like 1559, NULL for columns that are empty and that I want to skip, and character for text.

Upvotes: 1

Views: 1761

Answers (1)

jpinelo

Reputation: 1454

For an unconventional date-time format, one can import the column as character (step 1) and then convert it with strptime (step 2).

step 1

df <- read.table(file = "data.csv",
                 header = TRUE,
                 sep = ",",
                 dec = ".",
                 colClasses = "character",
                 comment.char = "")

step 2

strptime(df$v1, "%m/%d/%Y %H:%M:%S")

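Here v1 is the name of the column to convert; in this case it holds date-times in the unconventional format 12/13/2014 15:16:17. Note the capital %Y for a four-digit year (lowercase %y expects two digits) and %S for the seconds.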
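
The two steps can be sketched end to end on inline data, so the example runs without a file. The column names v1 and v2 and the sample rows are made up for illustration, and text = stands in for file =. Wrapping strptime in as.POSIXct is a choice made here, since strptime returns POSIXlt, which is awkward to keep in a data frame column; tz = "UTC" is likewise an assumption to make the result reproducible.

```r
# Hypothetical sample data standing in for the CSV file
csv_text <- "v1,v2
12/13/2014 15:16:17,1559
1/5/2015 15:00:00,42"

# Step 1: read every column as character
df <- read.table(text = csv_text,
                 header = TRUE,
                 sep = ",",
                 colClasses = "character",
                 comment.char = "")

# Step 2: convert the date-time and numeric columns
df$v1 <- as.POSIXct(strptime(df$v1, "%m/%d/%Y %H:%M:%S", tz = "UTC"))
df$v2 <- as.numeric(df$v2)

str(df)
```

Any value that does not match the format string comes back as NA, so checking anyNA(df$v1) after the conversion is a cheap sanity test.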

Notes: The sep argument is needed here because read.table defaults to sep = "", which splits fields on whitespace.
With read.csv the sep argument can be omitted, since it defaults to ",".
Setting comment.char = "" (when possible) speeds up reading.
Useful info at http://cran.r-project.org/doc/manuals/r-release/R-data.pdf
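
On the "did not have 19 elements" error in the question: c() silently drops bare NULL values, so the colClasses vector there has 19 entries rather than 21, which appears to be where the 19 comes from. Columns to skip have to be given as the quoted string "NULL". The sketch below only checks lengths and reads no file; the date-time columns are set to "character" on the assumption that they are converted afterwards as in step 2.

```r
# A bare NULL vanishes inside c(); the quoted string "NULL" is kept
length(c("numeric", NULL, "numeric"))    # 2
length(c("numeric", "NULL", "numeric"))  # 3

# Same column layout as in the question, date-time columns read as character
col_classes <- c(rep("character", 6),
                 rep("numeric", 2),
                 rep("character", 2),
                 "numeric",
                 "NULL",
                 "numeric",
                 "NULL",
                 rep("character", 2),
                 rep("numeric", 5))
length(col_classes)  # 21
```

A vector built this way can be passed as colClasses = col_classes together with sep = ",".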

Upvotes: 1
