Pinaypy

Reputation: 57

How to transform data in Column Date using cut function in R

I have a Release date column in my dataset and need to add a Decade column, which is supposed to have 4 levels: "1980s", "1990s", "2000s", "2010s".

1980s within 1980-01-01 to 1989-12-31

1990s within 1990-01-01 to 1999-12-31 etc.

Sample of the Release date column (image not included).

Here is my code so far:

df$Decade <- cut(df$Release, c(1970,1980,1990,2000))
levels(df$Decade) <- c("1980s", "1990s", "2000s", "2010s")

Here's the error I'm getting:

Error in cut.Date(df$Release, 10 + c(1970, 1980, 1990, 2000)) : invalid specification of 'breaks'

Any help will be greatly appreciated.

Upvotes: 0

Views: 248

Answers (2)

Edward

Reputation: 18683

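For "Date" objects, you can't cut like that: cut() on dates expects breaks that are themselves dates, a single number of intervals, or an interval specification such as "10 years", not a vector of numeric years. I'm sure there's a base R version, but lubridate can make your life easier if you don't care too much about the how or don't want to do things from scratch.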

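library(lubridate)

# Assumes df$Release is already of class Date.
# Round each date down to the start of its decade, keep the year, append "s".
df$Decade <- paste0(format(floor_date(df$Release, years(10)), "%Y"), "s")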
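A base R version is possible too. The following is only a sketch, assuming Release is already a Date column; the sample dates and the 2020-01-01 upper bound are made up for illustration. The idea is to pass cut() a vector of Date breaks instead of numeric years, and supply the four labels directly.

```r
# Hypothetical sample data; in practice df$Release comes from your dataset.
df <- data.frame(
  Release = as.Date(c("1980-05-21", "1989-12-31", "1991-05-12", "2015-03-01"))
)

# Breaks must be Dates. Five break points define four decade intervals.
decade_breaks <- as.Date(c("1980-01-01", "1990-01-01", "2000-01-01",
                           "2010-01-01", "2020-01-01"))

# right = FALSE makes each interval closed on the left, open on the right,
# so 1989-12-31 falls in the 1980s and 1990-01-01 would fall in the 1990s.
df$Decade <- cut(df$Release,
                 breaks = decade_breaks,
                 labels = c("1980s", "1990s", "2000s", "2010s"),
                 right = FALSE)
```

Dates outside the break range come back as NA, and the result is a factor that keeps all four levels even if a decade has no rows.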

Upvotes: 1

Ronak Shah

Reputation: 388982

One way would be to convert Release to a date and extract only the first 3 characters of the year (199 for 1991, 198 for 1987), then append "0s" to get the decade.

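df <- data.frame(Release = c('5/21/1980', '12/12/1980', '5/12/1991'))
# Parse the month/day/year strings; a Date prints as "YYYY-MM-DD",
# so its first 3 characters identify the decade.
df$Decade <- paste0(substring(as.Date(df$Release, '%m/%d/%Y'), 1, 3), "0s")
df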
#     Release Decade
#1  5/21/1980  1980s
#2 12/12/1980  1980s
#3  5/12/1991  1990s
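If Decade needs to be a factor with exactly the four levels from the question, a variation like the one below might work. It is a sketch under the same assumption about the m/d/Y input format; it replaces the substring step with integer division on the year, and the helper variable names are only illustrative.

```r
df <- data.frame(Release = c("5/21/1980", "12/12/1980", "5/12/1991"))

# Parse the strings, then pull out the year as an integer.
release_date <- as.Date(df$Release, format = "%m/%d/%Y")
release_year <- as.integer(format(release_date, "%Y"))

# Integer-divide by 10 and multiply back to floor the year to its decade.
decade_labels <- paste0(release_year %/% 10 * 10, "s")

# Fix the level set so unused decades still appear as levels;
# years outside these four decades become NA.
df$Decade <- factor(decade_labels,
                    levels = c("1980s", "1990s", "2000s", "2010s"))
```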

Upvotes: 0
