Reputation: 2849
I am helping someone try to get to the solution they want without making too many changes to the code they came up with. I know that the for loop is not necessary. For example, you could solve it by adding datenumeric <- as.Date(datenumeric, "%Y%m%d") to their convertdatereadable function before passing it into lapply. I am having trouble replicating the same results using a for loop.
Request
dat has a date column with the following double values:
1947.01
1947.02
1947.03
1947.04
1947.05
The request is to convert the date column to Date class, parsing the values with format = "%Y%m%d".
Reproducible example
dat <- structure(list(date = c(1947.01000976562, 1947.02001953125, 1947.03002929688,
1947.0400390625, 1947.05004882812), sp500 = c(15.210000038147,
15.8000001907349, 15.1599998474121, 14.6000003814697, 14.3400001525879
), divyld = c(4.48999977111816, 4.38000011444092, 4.6100001335144,
4.75, 5.05000019073486), i3 = c(0.379999995231628, 0.379999995231628,
0.379999995231628, 0.379999995231628, 0.379999995231628), ip = c(22.3999996185303,
22.5, 22.6000003814697, 22.5, 22.6000003814697), pcsp = c(NA,
46.5483322143555, -48.6076202392578, -44.3271369934082, -21.3698806762695
), rsp500 = c(NA, 50.9283332824707, -43.9976196289062, -39.5771369934082,
-16.319881439209), pcip = c(NA, 5.35716342926025, 5.33335399627686,
-5.30975437164307, 5.33335399627686), ci3 = c(NA, 0, 0, 0, 0),
ci3_1 = c(NA, NA, 0, 0, 0), ci3_2 = c(NA, NA, NA, 0, 0),
pcip_1 = c(NA, NA, 5.35716342926025, 5.33335399627686, -5.30975437164307
), pcip_2 = c(NA, NA, NA, 5.35716342926025, 5.33335399627686
), pcip_3 = c(NA, NA, NA, NA, 5.35716342926025), pcsp_1 = c(NA,
NA, 46.5483322143555, -48.6076202392578, -44.3271369934082
), pcsp_2 = c(NA, NA, NA, 46.5483322143555, -48.6076202392578
), pcsp_3 = c(NA, NA, NA, NA, 46.5483322143555), month = c(-156,
-155, -154, -153, -152)), row.names = c(NA, 5L), class = "data.frame")
Code that includes their convertdatereadable function
convertdatereadable <- function(datenumeric){
  datenumeric <- trunc(datenumeric * 10000 + 1)
  datenumeric <- as.character(datenumeric)
  return(datenumeric)
}
dat[1] <- lapply(dat[1], convertdatereadable)
for (n in 1:nrow(dat)){
  dat$date <- as.Date(dat[n, 1], format = "%Y%m%d")
}
The for loop in its current state outputs the correct format but is, unfortunately, replicating the first date for all 5 rows.
Incorrect current output
dat[1]
#> date
#> 1 1947-01-01
#> 2 1947-01-01
#> 3 1947-01-01
#> 4 1947-01-01
#> 5 1947-01-01
Desired output while keeping the for loop
dat[1]
#> date
#> 1 1947-01-01
#> 2 1947-02-01
#> 3 1947-03-01
#> 4 1947-04-01
#> 5 1947-05-01
I thought this would work, but it doesn't:
for (n in 1:nrow(dat)){
  dat[n, 1] <- as.Date(dat[n, 1], format = "%Y%m%d")
}
Upvotes: 1
Views: 64
Reputation: 93813
As others have said, using as.Date(..., format="%Y%m%d") is the way to do this rather than a loop.
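A minimal sketch of that loop-free route, assuming the Date conversion is simply appended to the question's convertdatereadable function. The one-column dat below is a cut-down stand-in for the full data frame, not the original object:

```r
# Stand-in for the question's `dat`, keeping only the date column (assumption)
dat <- data.frame(date = c(1947.01000976562, 1947.02001953125, 1947.03002929688,
                           1947.0400390625, 1947.05004882812))

# Same steps as the question's function, with the Date conversion added at the end
convertdatereadable <- function(datenumeric) {
  datenumeric <- trunc(datenumeric * 10000 + 1)   # 1947.01... becomes 19470101
  datenumeric <- as.character(datenumeric)        # "19470101"
  as.Date(datenumeric, format = "%Y%m%d")         # vectorized: converts every element
}

dat$date <- convertdatereadable(dat$date)
dat$date
#[1] "1947-01-01" "1947-02-01" "1947-03-01" "1947-04-01" "1947-05-01"
```

Because as.Date works on the whole vector at once, the column is replaced in a single assignment and no element-wise coercion takes place.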
But to understand what is going on here, break it down and check the status of the output after each line:
First, let's fix the loop to index both sides by n so that each value is overwritten in turn:
for (n in 1:nrow(dat)){
  dat$date[n] <- as.Date(dat$date[n], format = "%Y%m%d")
}
This results in a character representation of the number of days since 1970-01-01 (dates are stored in R as the numeric version of this):
dat$date
#[1] "-8401" "-8370" "-8342" "-8311" "-8281"
class(dat$date)
#[1] "character"
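To see where a number such as -8401 comes from, here is a small sketch that strips the class from a single Date. The variable name d is only illustrative:

```r
d <- as.Date("19470101", format = "%Y%m%d")
d
#[1] "1947-01-01"

# Removing the class exposes the number R actually stores
unclass(d)
#[1] -8401

# The same count, computed as a difference from 1970-01-01
as.numeric(d - as.Date("1970-01-01"))
#[1] -8401
```

The loop above ends up with the text form of that stored number because the Date class is lost during the element-wise assignment.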
Why character and not numeric? Because you are using [<- and not <-, that is, you are not overwriting the whole dat$date column, but each element dat$date[1], dat$date[2], etc. in turn. Assigning into an element keeps the column's existing character class in this case, since numeric data can always be coerced to character, whereas character data can't necessarily be coerced to a number. Whichever side is the character one, the result is character. E.g.:
x <- c("a","b","c")
x[1] <- 1
x
#[1] "1" "b" "c"
x <- c(1,2,3)
x[1] <- "a"
x
#[1] "a" "2" "3"
If you overwrite the whole object though, the class will change:
x <- c("a","b","c")
x <- c(1,2,3)
x
#[1] 1 2 3
You then need to force the class back to date:
class(dat$date) <- "Date"
dat$date
#[1] "1947-01-01" "1947-02-01" "1947-03-01" "1947-04-01" "1947-05-01"
class(dat$date)
#[1] "Date"
You could also get the same result by converting explicitly, which additionally stores the values as numbers rather than as character strings:
dat$date <- as.Date(as.numeric(dat$date), origin="1970-01-01")
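If the loop has to stay, one possible way to sidestep the coercion altogether is to fill a separate vector that is already of class Date and assign it to the column once at the end. This is only a sketch: the name converted is hypothetical, and the small dat below stands in for the date column as it looks after convertdatereadable has run.

```r
# Stand-in for the date column after convertdatereadable() (assumption)
dat <- data.frame(date = c("19470101", "19470201", "19470301",
                           "19470401", "19470501"),
                  stringsAsFactors = FALSE)

# Preallocate a Date vector; assigning into it element by element keeps the class
converted <- rep(as.Date(NA), nrow(dat))

for (n in seq_len(nrow(dat))) {
  converted[n] <- as.Date(dat$date[n], format = "%Y%m%d")
}

# Replace the whole column in one assignment, so its class changes to Date
dat$date <- converted
class(dat$date)
#[1] "Date"
```

Here the vector being assigned into is already a Date, so each value stays a date instead of being turned into text.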
Upvotes: 1
Reputation: 390
You are almost there. You just need to change the variable in the loop, as below:
for (n in 1:nrow(dat)){
  dat$crcteddate <- as.Date(dat$date, format = "%Y%m%d")
}
This creates a column called 'crcteddate' with the following values:
"1947-01-01" "1947-02-01" "1947-03-01" "1947-04-01" "1947-05-01"
The mistake was referring to the date column as dat[n, 1] instead of referring to dat$date directly. Note that as.Date is vectorized, so each iteration converts the whole column and the loop simply repeats the same assignment nrow(dat) times.
Upvotes: 1