Sergio Olalla
Sergio Olalla

Reputation: 99

Months to integer R

This is part of the dataframe I am working on. The first column represents the year, the second the month, and the third one the number of observations for that month of that year.

2005 07    2
2005 10    4
2005 12    2
2006 01    4
2006 02    1
2006 07    2
2006 08    1
2006 10    3

I have observations from 2000 to 2018. I would like to run a Kernel Regression on this data, so I need to create a continuum integer from a date class vector. For instance Jan 2000 would be 1, Jan 2001 would be 13, Jan 2002 would be 25 and so on. With that I will be able to run the Kernel. Later on, I need to translate that back (1 would be Jan 2000, 2 would be Feb 2000 and so on) to plot my model.

Upvotes: 8

Views: 390

Answers (3)

lmo
lmo

Reputation: 38500

Just use a little algebra:

df$cont <- (df$year - 2000L) * 12L + df$month

You could go backward with modulus and integer division.

df$year <- df$cont %/% 12 + 2000L
df$month <- df$cont %% 12 # 12 is set at 0, so fix that with next line.
df$month[df$month == 0L] <- 12L

Here, %% is the modulus operator and %/% is the integer division operator. See ?"%%" for an explanation of these and other arithmetic operators.

Upvotes: 6

MKR
MKR

Reputation: 20095

An option is to use yearmon type from zoo package and then calculate difference of months from Jan 2001 using difference between yearmon type.

library(zoo)

# +1 has been added to difference so that Jan 2001 is treated as 1
df$slNum = (as.yearmon(paste0(df$year, df$month),"%Y%m")-as.yearmon("200001","%Y%m"))*12+1

#   year month obs slNum
# 1 2005     7   2    67
# 2 2005    10   4    70
# 3 2005    12   2    72
# 4 2006     1   4    73
# 5 2006     2   1    74
# 6 2006     7   2    79
# 7 2006     8   1    80
# 8 2006    10   3    82

Data:

df <- read.table(text = 
"year month obs
2005 07    2
2005 10    4
2005 12    2
2006 01    4
2006 02    1
2006 07    2
2006 08    1
2006 10    3",
header = TRUE, stringsAsFactors = FALSE)

Upvotes: 0

phiver
phiver

Reputation: 23598

What you can do is something like the following. First create a dates data.frame with expand.grid so we have all the years and months from 2000 01 to 2018 12. Next put this in the correct order and last add an order column so that 2000 01 starts with 1 and 2018 12 is 228. If you merge this with your original table you get the below result. You can then remove columns you don't need. And because you have a dates table you can return the year and month columns based on the order column.

dates <- expand.grid(year = seq(2000, 2018), month = seq(1, 12))
dates <- dates[order(dates$year, dates$month), ]
dates$order <- seq_along(dates$year)


merge(df, dates, by.x = c("year", "month"), by.y = c("year", "month"))

  year month obs order
1 2005    10   4    70
2 2005    12   2    72
3 2005     7   2    67
4 2006     1   4    73
5 2006    10   3    82
6 2006     2   1    74
7 2006     7   2    79
8 2006     8   1    80

data:

df <- structure(list(year = c(2005L, 2005L, 2005L, 2006L, 2006L, 2006L, 2006L, 2006L), 
                     month = c(7L, 10L, 12L, 1L, 2L, 7L, 8L, 10L), 
                     obs = c(2L, 4L, 2L, 4L, 1L, 2L, 1L, 3L)), 
                class = "data.frame", 
                row.names = c(NA, -8L))

Upvotes: 5

Related Questions