Reputation: 163
I have a data frame describing a large number of people, and I want to assign each person to a group based on several variables. For example, say I have the variable "state" with 5 states, the variable "age group" with 4 groups, and the variable "income" with 5 groups. That gives 5 x 4 x 5 = 100 groups, which I want to number from 1 to 100. In the past I have always done this with a combination of ifelse statements, but with 100 possible outcomes I am wondering whether there is a faster way than specifying each combination by hand.
Here's a MWE with the expected outcome:
mydata <- as.data.frame(cbind(c("FR","UK","UK","IT","DE","ES","FR","DE","IT","UK"),
                              c("20","80","20","40","60","20","60","80","40","60"),
                              c(1,4,2,3,1,5,5,3,4,2)))
colnames(mydata) <- c("Country","Age","Income")
group_grid <- transform(expand.grid(state = c("IT","FR","UK","ES","DE"),
                                    age = c("20","40","60","80"),
                                    income = 1:5),
                        val = 1:100)
desired_result <- as.data.frame(cbind(c("FR","UK","UK","IT","DE","ES","FR","DE","IT","UK"),
                                      c("20","80","20","40","60","20","60","80","40","60"),
                                      c(1,4,2,3,1,5,5,3,4,2),
                                      c(2,78,23,46,15,84,92,60,66,33)))
colnames(desired_result) <- c("Country","Age","Income","Group_code")
Upvotes: 0
Views: 116
Reputation: 887941
Here is a left_join option using dplyr:
library(dplyr)
# The join needs key columns of the same class, so convert them all to character.
grpD <- group_grid %>%
  mutate_if(is.factor, as.character) %>%
  mutate(income = as.character(income))
mydata %>%
  mutate_if(is.factor, as.character) %>%  # convert factor columns here too
  left_join(grpD, by = c("Country" = "state", "Age" = "age", "Income" = "income"))
#   Country Age Income val
#1       FR  20      1   2
#2       UK  80      4  78
#3       UK  20      2  23
#4       IT  40      3  46
#5       DE  60      1  15
#6       ES  20      5  84
#7       FR  60      5  92
#8       DE  80      3  60
#9       IT  40      4  66
#10      UK  60      2  33
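The same lookup can also be sketched in base R, without dplyr, by pasting the three key columns into a single string and matching it against the grid. This is only an illustration: the helper names key_data and key_grid are made up here, and mydata is rebuilt with data.frame() so that the snippet runs on its own.

```r
# Example data from the question, rebuilt so this snippet is self-contained.
mydata <- data.frame(
  Country = c("FR","UK","UK","IT","DE","ES","FR","DE","IT","UK"),
  Age     = c("20","80","20","40","60","20","60","80","40","60"),
  Income  = c(1,4,2,3,1,5,5,3,4,2),
  stringsAsFactors = FALSE
)
group_grid <- transform(expand.grid(state = c("IT","FR","UK","ES","DE"),
                                    age = c("20","40","60","80"),
                                    income = 1:5),
                        val = 1:100)

# Build one text key per row (hypothetical helper names); paste() converts
# factors and numbers to their printed labels, so column classes need not match.
key_data <- paste(mydata$Country, mydata$Age, mydata$Income)
key_grid <- paste(group_grid$state, group_grid$age, group_grid$income)

# Look up each person's key in the grid and take the corresponding code.
mydata$Group_code <- group_grid$val[match(key_data, key_grid)]
```

Rows whose combination is missing from the grid would get NA, which is the same behaviour as an unmatched row in a left join.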
Upvotes: 0
Reputation: 10437
mydata$Group_code <- with(mydata, as.integer(interaction(Country, Age, Income)))
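should do it, as long as the exact numbering does not matter: the codes follow the default sorted factor levels (DE, ES, FR, IT, UK for Country), not the order used in group_grid, so they will differ from desired_result.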
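To get the same codes as the question's group_grid, one option is to pass factors with explicit levels to interaction(). With the default lex.order = FALSE the first factor varies fastest, which is the same ordering expand.grid() uses. A minimal sketch, assuming the level orders are the ones given in the question's grid; mydata is rebuilt with data.frame() so the snippet runs on its own:

```r
# Example data from the question, rebuilt so this snippet is self-contained.
mydata <- data.frame(
  Country = c("FR","UK","UK","IT","DE","ES","FR","DE","IT","UK"),
  Age     = c("20","80","20","40","60","20","60","80","40","60"),
  Income  = c(1,4,2,3,1,5,5,3,4,2),
  stringsAsFactors = FALSE
)

# Pin the level order of each variable to the order used in group_grid,
# then let interaction() number the combinations (first factor varies fastest).
mydata$Group_code <- with(mydata, as.integer(interaction(
  factor(Country, levels = c("IT","FR","UK","ES","DE")),
  factor(Age,     levels = c("20","40","60","80")),
  factor(Income,  levels = 1:5)
)))

mydata$Group_code
# 2 78 23 46 15 84 92 60 66 33
```

Any value that is not listed in the levels would become NA, so the level vectors need to cover every value that occurs in the data.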
Upvotes: 1