Efficient way to recode multiple date values in R

I have a fairly large monthly database in which the dates are recorded poorly.

For instance, for January 2000, the value is "200001". So I have values ranging from "200001" to "200012". To make matters worse, each month is recorded in a different .csv file.

First, I loaded all the .csv files together into a list I called "tbl", so tbl[[1]] returns the values for January, for example. What I need is an efficient way to recode "20000i" as "2000-01-0i", where i goes from 1 to 12, and then convert those values to Date format.

What I've tried is:

for (i in length(tbl)) {
  if (i < 10) {
    tbl[[i]]$DATA %>% as.character() %>% revalue(c(paste0("20000",i) = paste0("2000-01-0",i))) %>%  as.Date() -> tbl[[i]]$DATA
  } else {
    tbl[[i]]$DATA %>% as.character() %>% revalue(c(paste0("2000",i) = paste0("2000-01-",i))) %>%  as.Date() -> tbl[[i]]$DATA
  }
}

This approach does not work and returns the following error: Error: unexpected '=' in " tbl[[i]]$DATA %>% as.character() %>% revalue(c(paste0("2000",i) ="

Does anybody have a better idea?

EDIT: an example of my data

list(c("200001", "200001", "200001", "200001", "200001", "200001","200001", "200001", "200001", "200001", "200001", "200001"), 
c("200002", "200002", "200002", "200002", "200002", "200002", 
"200002", "200002", "200002", "200002", "200002", "200002"
), c("200003", "200003", "200003", "200003", "200003", "200003", 
"200003", "200003", "200003", "200003", "200003", "200003"
), c("200004", "200004", "200004", "200004", "200004", "200004", 
"200004", "200004", "200004", "200004", "200004", "200004"
), c("200005", "200005", "200005", "200005", "200005", "200005", 
"200005", "200005", "200005", "200005", "200005", "200005"
), c("200006", "200006", "200006", "200006", "200006", "200006", 
"200006", "200006", "200006", "200006", "200006", "200006"
), c("200007", "200007", "200007", "200007", "200007", "200007", 
"200007", "200007", "200007", "200007", "200007", "200007"
), c("200008", "200008", "200008", "200008", "200008", "200008", 
"200008", "200008", "200008", "200008", "200008", "200008"
), c("200009", "200009", "200009", "200009", "200009", "200009", 
"200009", "200009", "200009", "200009", "200009", "200009"
), c("200010", "200010", "200010", "200010", "200010", "200010", 
"200010", "200010", "200010", "200010", "200010", "200010"
), c("200011", "200011", "200011", "200011", "200011", "200011", 
"200011", "200011", "200011", "200011", "200011", "200011"
), c("200012", "200012", "200012", "200012", "200012", "200012", 
"200012", "200012", "200012", "200012", "200012", "200012"
))

Upvotes: 1

Views: 325

Answers (1)

Dave2e

Reputation: 24089

To convert your input into a Date object, you need to append a day to the year-month string and then supply the matching format:

for (i in seq_along(tbl)) {
   # Append "01" as the day, then parse year, month and day in one call
   tbl[[i]]$DATA <- as.Date(paste0(tbl[[i]]$DATA, "01"), format = "%Y%m%d")
}

This will make every input the first day of the month. For just a dozen items, a for loop is quick enough.
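If you would rather avoid the explicit loop, the same conversion can probably be written with lapply(). The following is only a minimal sketch: the tbl built here is a hypothetical stand-in for your list of monthly data frames (each with a DATA column of "YYYYMM" strings), and to_first_of_month is a helper name made up for illustration.

```r
# Hypothetical stand-in for the list of monthly data frames read from the .csv files;
# each element has a DATA column holding "YYYYMM" strings.
tbl <- lapply(1:12, function(m) {
  data.frame(DATA = rep(sprintf("2000%02d", m), 3), stringsAsFactors = FALSE)
})

# Illustrative helper: append "01" so every value has a day component,
# then parse with a format that matches the resulting "YYYYMMDD" string.
to_first_of_month <- function(x) {
  as.Date(paste0(as.character(x), "01"), format = "%Y%m%d")
}

# Apply the conversion to the DATA column of every data frame in the list.
tbl <- lapply(tbl, function(df) {
  df$DATA <- to_first_of_month(df$DATA)
  df
})

# Inspect the converted column for the first and last month.
print(tbl[[1]]$DATA)
print(tbl[[12]]$DATA)
```

As for the original error: names inside c() have to be written literally, so a call such as paste0(...) on the left of = is a syntax error; setNames() is one way to build a named vector from computed names. Note also that for (i in length(tbl)) visits only the last element, whereas seq_along(tbl) visits every one.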

Upvotes: 2
