Jebediah15

Reputation: 794

R Date origin for formatting

I read this example about using the origin to set the start date. I am reading data from a CSV file in which the dates are stored as day counts, and they are read in as factors.

Example: $ admin_date    : Factor w/ 318 levels "37362","37735"

I would like to convert these to Date format so that I can measure the number of days and months between events. But when I try different origins, such as:

df$admin_date_new <- as.numeric(df$admin_date)
df$admin_date_new <- as.Date(df$admin_date, origin="1900/01/01")

I get the following:

 $ admin_date_new: Date, format: "1900-08-31"

Is this a problem with finding the correct origin, or with how I am converting to numeric?

Is there a quick way to find the necessary origin? I read help("as.Date") and all I found was:

## date given as number of days since 1900-01-01 (a date in 1989)
as.Date(32768, origin = "1900-01-01")
## Excel is said to use 1900-01-01 as day 1 (Windows default) or
## 1904-01-01 as day 0 (Mac default), but this is complicated by Excel
## incorrectly treating 1900 as a leap year.
## So for dates (post-1901) from Windows Excel
as.Date(35981, origin = "1899-12-30") # 1998-07-05
## and Mac Excel
as.Date(34519, origin = "1904-01-01") # 1998-07-05
## (these values come from http://support.microsoft.com/kb/214330)

Upvotes: 1

Views: 6598

Answers (1)

Roland

Reputation: 132874

You don't give the expected result, but maybe this:

x <- factor(c("37362", "37735"))
y <- as.numeric(as.character(x))
as.Date(y, origin = "1900-01-01")
#[1] "2002-04-18" "2003-04-26"
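
The as.character() step is what matters here: calling as.numeric() directly on a factor returns the internal level codes rather than the values printed as labels. A minimal sketch on the same two example values (the variable names codes, values and values_alt are only for illustration):

```r
x <- factor(c("37362", "37735"))

# as.numeric() on a factor returns the integer level codes, not the labels
codes <- as.numeric(x)
print(codes)

# Converting to character first recovers the day counts stored in the labels
values <- as.numeric(as.character(x))
print(values)

# Equivalent alternative: convert the levels once, then index by the factor
values_alt <- as.numeric(levels(x))[x]
print(values_alt)
```

Small level codes used as day counts would also explain a result that lands in the year 1900, as in the question.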

If you don't know the origin, there is no way to find out just from your data.
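
What you can do is compare the candidate origins mentioned in help("as.Date") and check the results against an event whose real date you know. This is only a sketch: it assumes the numbers are day counts from one of those three origins, which the data alone cannot confirm, and the names in origins are made up for illustration.

```r
y <- c(37362, 37735)

# Candidate origins taken from the examples in help("as.Date")
origins <- c(
  r_help_example = "1900-01-01",
  excel_windows  = "1899-12-30",
  excel_mac      = "1904-01-01"
)

# One Date vector per candidate origin, for comparison with a known event
candidates <- lapply(origins, function(o) as.Date(y, origin = o))
print(candidates)

# The gap between two events is the same whatever the origin is
gap_days <- diff(y)
print(gap_days)
```

If you only need the number of days between events, the origin does not matter, because it cancels out in the difference.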

Upvotes: 4
