Justin M

Reputation: 33

strptime error in R: input string is too long

I cannot seem to convert my data from a CSV into a proper date class. I am using a CSV of 1033 dates, saved in the format 'YYYYMMDD'.

Here is my code for importing the csv (which seems to work):

bd <- read.csv('birthdaysExample.csv', 
           header = FALSE, 
           sep = ',')

I can see the data in R Studio:

> head(bd)
        V1
1 20141125
2 20140608
3 20140912
4 20140526
5 20140220
6 20140619

However, when I attempt to convert the dates I receive the error: "Error in strptime(bd, format = "%Y%m%d") : input string is too long."

Below is my code:

better_bds <- strptime(bd,format='%Y%m%d')

I have even tried to verify that all of my dates do in fact have 8 characters:

> table(nchar(bd$V1) != 8 | nchar(bd$V1) != 8)

FALSE 
1033

So I'm not sure where to turn next. If anyone could point me in the right direction, I would appreciate it!

Upvotes: 3

Views: 8106

Answers (4)

tassones

Reputation: 1692

Here is a dplyr approach:

  1. Recreate the example
bd <- structure(list(V1 = c(20141125L, 20140608L, 20140912L, 20140526L,
                            20140220L, 20140619L)), .Names = "V1", class = "data.frame",
                row.names = c(NA, -6L))

as.character(bd)

bd
  2. Answer
library(dplyr)

better_bds <- bd %>%
  # convert the integer column to character, then parse it as a Date
  mutate(Date = as.Date(as.character(V1), format = "%Y%m%d"))

better_bds

Upvotes: 0

Joshua Ulrich

Reputation: 176648

The problem is that bd is a one-column data.frame and strptime expects a character vector. If you don't pass a character vector to strptime, it calls as.character(x) on whatever you pass in. Calling as.character(bd) results in something you probably do not expect.

bd <- structure(list(V1 = c(20141125L, 20140608L, 20140912L, 20140526L,
  20140220L, 20140619L)), .Names = "V1", class = "data.frame",
  row.names = c(NA, -6L))
as.character(bd)
# [1] "c(20141125, 20140608, 20140912, 20140526, 20140220, 20140619)"

You need to extract the column from bd as a vector before passing it to strptime (as Hugh suggested in his comment). Note that the column is an integer vector here, so strptime converts it to character for you.

strptime(bd[,1], format="%Y%m%d")

Also, since you do not appear to have any actual time information, I would suggest you use the Date class instead. That will prevent you from encountering any potential timezone issues.

as.Date(as.character(bd[,1]), format="%Y%m%d")
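
To make the difference between the data.frame and its column concrete, here is a minimal sketch that rebuilds the six-row example, inspects the classes, and converts only the column. The variable name `dates` and the new column name `Date` are illustrative choices of mine, not something from the question.

```r
# Rebuild the six-row example from the question
bd <- data.frame(V1 = c(20141125L, 20140608L, 20140912L,
                        20140526L, 20140220L, 20140619L))

# bd is a data.frame; bd$V1 (or bd[, 1]) is the integer vector inside it
class(bd)
class(bd$V1)

# Convert the column, not the whole data.frame
dates <- as.Date(as.character(bd$V1), format = "%Y%m%d")
class(dates)

# Optionally store the result as a new column (the name 'Date' is arbitrary)
bd$Date <- dates
str(bd)
```

Storing the result as a column keeps the parsed dates aligned with the original values, which makes it easy to spot-check a few rows.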

Upvotes: 5

RHertel

Reputation: 23788

You could try:

better_bds <- sapply(bd,function(x) strptime(x,format='%Y%m%d'))

With your input data, I obtain

> better_bds
$V1
[1] "2014-11-25 CET"  "2014-06-08 CEST" "2014-09-12 CEST" "2014-05-26 CEST" "2014-02-20 CET"  "2014-06-19 CEST"

Upvotes: 0

pmr

Reputation: 1006

The format of your date strings has to match the format string you pass to strptime. For example:

> x <- c("2006-01-08", "2006-08-07")
> strptime(x, "%Y-%m-%d")
[1] "2006-01-08" "2006-08-07"

> y <- c("2006/01/08", "2006/08/07")
> strptime(y, "%Y/%m/%d")
[1] "2006-01-08" "2006-08-07"

If the format string does not match the input, strptime does not raise an error; it returns NA:

> x <- c("2006-01-08", "2006-08-07")
> strptime(x, "%Y/%m/%d")
[1] NA NA

> y <- c("2006/01/08", "2006/08/07")
> strptime(y, "%Y-%m-%d")
[1] NA NA

> x <- c("20060108", "20060807")
> strptime(x, "%Y-%m-%d")
[1] NA NA
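
Because a mismatched format yields NA silently, it can be worth checking the result after converting. Here is a minimal sketch, assuming the dates are eight-digit strings as in the question. The vector `x` and its deliberately malformed third entry are made up for illustration, and I use as.Date instead of strptime only to keep time zones out of the picture.

```r
# Two well-formed entries and one that does not match the format
x <- c("20060108", "20060807", "2006-08-07")

# The format matches the first two entries only
parsed <- as.Date(x, format = "%Y%m%d")

# Entries that failed to parse come back as NA
is.na(parsed)

# Show the original strings that could not be converted
x[is.na(parsed)]
```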

Hope this helps.

Upvotes: 0
