Nevs Bia

Reputation: 107

Interval in months between two columns in R

I have this data:

"code";"min";"max"
"CM106";2016-12-01;2018-08-01
"CM107";2017-10-01;2019-11-01
"CM109";2017-01-01;2019-02-01
"CM113";2018-02-01;2019-03-01
"CM114";2016-10-01;2017-12-01
"CM118";2018-04-01;2018-11-01
"CM121";2018-05-01;2020-02-01
"CM126";2018-08-01;2018-11-01
"CM129";2017-01-01;2018-04-01
"CM131";2018-09-01;2020-05-01
"CM144";2018-02-01;2019-11-01
"CM150";2018-10-01;2019-04-01
"CM153";2018-05-01;2018-09-01
"CM154";2016-05-01;2019-06-01

The dates are in year-month-day format.

I want to create a new column with the interval in months between the "min" and "max" columns.

I tried to follow this answer, but it didn't work: Count the months between two dates in a data.table

I get this:

intervalos[, 2:3 := lapply(.SD, as.IDate, format = "%Y.%m.%d"), .SDcols = 2:3]

Error in [.tbl_df(intervalos, , :=(2:3, lapply(.SD, as.IDate, format = "%Y.%m.%d")), : unused argument (.SDcols = 2:3)

Upvotes: 1

Views: 243

Answers (1)

dario

Reputation: 6483

1. Create a minimal reproducible example:

df <- structure(list(c = c("CM106", "CM107", "CM109", "CM113", "CM114", "CM118", "CM121", "CM126", "CM129", "CM131", "CM144", "CM150", "CM153", "CM154"), 
                     min = c("2016-12-01", "2017-10-01", "2017-01-01", "2018-02-01", "2016-10-01", "2018-04-01", "2018-05-01", "2018-08-01", "2017-01-01", "2018-09-01", "2018-02-01", "2018-10-01", "2018-05-01", "2016-05-01"),
                     max = c("2018-08-01", "2019-11-01", "2019-02-01", "2019-03-01", "2017-12-01", "2018-11-01", "2020-02-01", "2018-11-01", "2018-04-01", "2020-05-01", "2019-11-01", "2019-04-01", "2018-09-01", "2019-06-01")),
                class = "data.frame", row.names = c(NA, -14L))

2. Solution using base R:

Convert both columns to Date with as.Date:

df$min <- as.Date(df$min, "%Y-%m-%d")
df$max <- as.Date(df$max, "%Y-%m-%d")
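The format string has to match the separators in the data. As a small sketch (the vector names here are illustrative, not from the question), the "%Y.%m.%d" format used in the question's attempt does not parse dates written with "-" and yields NA in base R:

```r
dates <- c("2016-12-01", "2018-08-01")

# Format matches the "-" separators, so parsing succeeds
ok <- as.Date(dates, format = "%Y-%m-%d")

# Format expects "." separators, which do not occur in the data
bad <- as.Date(dates, format = "%Y.%m.%d")

print(ok)
print(anyNA(bad))
```

Checking for NA right after the conversion is a cheap way to catch a mismatched format before computing any differences.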

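Calculate the difference in days, then convert it to months using the average month length (365.25 / 12 days), which gives an approximate, fractional month count: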

df$diff_days <- df$max - df$min
df$diff_months <- as.numeric(df$diff_days) / (365.25 / 12)

df$diff_days is now:

Time differences in days
 [1]  608  761  761  393  426  214
 [7]  641   92  455  608  638  182
[13]  123 1126

and df$diff_months is:

 [1] 19.975359 25.002053 25.002053 12.911704 13.995893
 [6]  7.030801 21.059548  3.022587 14.948665 19.975359
[11] 20.960986  5.979466  4.041068 36.993840
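If whole calendar months are wanted instead of the fractional approximation, one possible base R sketch is to compare the year and month fields directly. The helper name months_between is hypothetical (it is not part of base R or of the answer above), and it ignores the day of the month, which is harmless here only because every date falls on the 1st:

```r
# Hypothetical helper: number of calendar-month boundaries between two dates.
# Assumes the day of the month can be ignored (all dates here are the 1st).
months_between <- function(from, to) {
  from <- as.POSIXlt(from)
  to <- as.POSIXlt(to)
  # $year is years since 1900 and $mon is 0-11; only the differences matter
  12 * (to$year - from$year) + (to$mon - from$mon)
}

# Three rows taken from the question's data (CM106, CM126, CM154)
min_date <- as.Date(c("2016-12-01", "2018-08-01", "2016-05-01"))
max_date <- as.Date(c("2018-08-01", "2018-11-01", "2019-06-01"))

whole_months <- months_between(min_date, max_date)
print(whole_months)
```

For these three rows the result is 20, 3 and 37, where the day-based approximation gives 19.98, 3.02 and 36.99.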

Upvotes: 4
