Dave

Reputation: 63

readr::read_csv() doesn't read dates and returns NA

I have a CSV file, test.csv, with a column containing dates:

V1
14-01-02 9:10
14-01-02 9:10
14-01-02 9:21
14-01-02 9:34
14-01-02 9:34
14-01-02 9:34

Reading the file with readr::read_csv yields NAs:

V1
1 <NA>
2 <NA>
3 <NA>
4 <NA>
5 <NA>
6 <NA>
Warning message:
9 problems parsing 'test.csv'. See problems(...) for more details. 

read.csv loads the file without a problem, but it is too slow. The actual table is 322,509 x 45, and I'd rather not specify each column type with the col_types option.

Is there any way it could load the column as character?

Upvotes: 3

Views: 3392

Answers (2)

phiver

Reputation: 23608

You can specify column types in a list, naming only the columns whose type you do not want readr to guess.

read_csv("test.csv", col_types = list(V1 = col_datetime()))

See also the readr README on CRAN for more details.
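If the goal is only to get V1 in as text, as the question asks, the same named-column idea should work with col_character(). This is a minimal sketch, assuming a readr version that provides cols(); the temporary file and the V2 column are made up for illustration and stand in for the real test.csv.

```r
library(readr)

# Hypothetical stand-in for test.csv (the real file has 45 columns).
tmp <- tempfile(fileext = ".csv")
writeLines(c("V1,V2", "14-01-02 9:10,1", "14-01-02 9:21,2"), tmp)

# Name only the problem column; readr still guesses the types of the others.
df <- read_csv(tmp, col_types = cols(V1 = col_character()))

str(df)
```

Only V1 is pinned here, so the other 44 columns of the real table would not need to be listed.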

Upvotes: 6

Rorschach

Reputation: 32426

?read_csv says of the col_types argument:

If 'NULL', the column type will be imputed from the first 30 rows on the input. This is convenient (and fast), but not robust. If the imputation fails, you'll need to supply the correct types yourself.

It sounds like you might be stuck with

read_csv("test.csv", col_types="T")  # T for datetimes

You could also try reading the first line with read.csv, getting classes, then reading the whole file with read_csv. You would need to convert character to datetime after the fact.

samp <- read.csv("test.csv", nrows = 1, stringsAsFactors = FALSE)       # read one row
cols <- sapply(samp, class)                                             # get classes
key <- c(character = "c", integer = "i", numeric = "d", logical = "l")  # map classes to readr codes, etc.
read_csv("test.csv", col_types = paste(key[cols], collapse = ""))       # read with read_csv
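The conversion after the fact could look something like this in base R. It is only a sketch: the values are copied from the question, the format string assumes the fields are year-month-day (swap %y and %d if they are actually day-month-year), and the UTC time zone is also an assumption.

```r
# Values copied from the V1 column in the question.
v1 <- c("14-01-02 9:10", "14-01-02 9:21", "14-01-02 9:34")

# ASSUMPTION: two-digit year, then month, then day; adjust the format if not.
# ASSUMPTION: tz = "UTC"; use the zone the data was actually recorded in.
parsed <- as.POSIXct(v1, format = "%y-%m-%d %H:%M", tz = "UTC")

print(parsed)
```

Any value that does not match the format comes back as NA, so checking anyNA(parsed) afterwards is a cheap way to spot rows that need a different format.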

Upvotes: 1
