Reputation: 463
Suppose you have a variable year (a column) in a data frame. I want to add a new variable named decade to the same data frame: if the year is between 1960 and 1969, the decade label should be sixty, and so on for the other decades. It's probably not that difficult, but I am new to this type of data formatting.
Upvotes: 0
Views: 809
Reputation: 2904
You can define a custom list of ranges, each with a start year, an end year, and a label, and then loop over that list to label the matching rows.
df = data.frame(column = c(1965,1958,1971,1980,1989))
# each key is list(first year, last year, label)
keys = list(list(1950,1959,"fifties"),
            list(1960,1969,"sixties"),
            list(1970,1979,"seventies"),
            list(1980,1989,"eighties"),
            list(1990,1999,"nineties"))
df$Label = NA
for(k in keys){
  # label every row whose year falls inside this key's range
  df$Label[df$column >= k[[1]] & df$column <= k[[2]]] = k[[3]]
}
The output of this little program is:
> df
  column     Label
1   1965   sixties
2   1958   fifties
3   1971 seventies
4   1980  eighties
5   1989  eighties
You can see that it would be quite easy to extend and adapt it to your exact problem.
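As a possible vectorized alternative to the loop, base R's cut() can map each year to a labelled interval in one call. This is only a sketch: the column name year is taken from the question, and the example data, the breaks vector and the decade column are assumptions made for illustration.

```r
# Hypothetical example data; `year` follows the column name used in the question.
df <- data.frame(year = c(1965, 1958, 1971, 1980, 1989))

# Break points at the start of each decade. With right = FALSE each
# interval is closed on the left, e.g. [1960, 1970).
breaks <- seq(1950, 2000, by = 10)
labels <- c("fifties", "sixties", "seventies", "eighties", "nineties")

# cut() returns a factor; convert to character to get plain labels.
df$decade <- as.character(cut(df$year, breaks = breaks, labels = labels, right = FALSE))
```

Years outside the range covered by breaks come back as NA, which mirrors the loop version, where unmatched rows keep their initial NA.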
Upvotes: 0
Reputation: 170
Assuming that all your years are in the 20th century, first create a character vector that stores your decade names:
decades <- c("10s", "20s", "30s", "40s", "50s", "60s", "70s", "80s", "90s")
Or you can get the same result with
decades <- paste(1:9 * 10, "s", sep = "")
Then use the decade number of each year as an index into that vector:
df$decades <- decades[(df$year - 1900) %/% 10]
Here %/% is the integer division operator, which returns the quotient without the remainder. With it you can adapt this solution to your needs.
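One thing to watch with the lookup above: for 1900 to 1909 the index is 0, and R drops zero indices, so the result no longer lines up with the rows of the data frame. If that matters, a variant that builds the label directly from the year avoids the lookup vector. This is a sketch under assumptions: the example data and the "1960s" label style are illustrative, not from the original answer.

```r
# Hypothetical example data, including years outside 1910-1999.
df <- data.frame(year = c(1905, 1965, 1999, 2004))

# Integer-divide by 10 and multiply back to get the first year of the decade.
decade_start <- df$year %/% 10 * 10

# Build labels such as "1960s" directly, with no lookup vector.
df$decade <- paste0(decade_start, "s")
```

Because no index is involved, this works for any century, at the cost of numeric labels instead of names like "sixties".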
Upvotes: 1