Reputation: 11
I have a question about subsetting data based on a dynamic set of columns and their classes. For example:
# These come from another source. I don't know exactly what they are called or how many classes each has.
# The following two demographics should help illustrate the problem
gender <- c(1,2)
agegroup <- c(1,2,3,4,5,6,7,8)
# more demographics ...
# reproducible data
set.seed(1)
datafile <- data.frame(gender = rep(gender, 100),
                       agegroup = rep(agegroup, 25),
                       col3 = runif(200))
# The subset below is only for gender = 1 and agegroup = 3
# A subset is needed for every combination of classes across the demographics
# Names must not be hardcoded, because the demographic names will not be known in advance
dat_gender_1_agegroup_3 <- datafile[datafile$gender == 1 & datafile$agegroup == 3, ]
But there can be more demographics than just gender and agegroup: income, education, race, and so on. Each demographic has a varying number of classes. How can I get the subsets of the dataset datafile across a varying number of columns? Thanks in advance.
Upvotes: 0
Views: 55
Reputation: 56189
Use expand.grid to build all combinations, then apply to subset the data for each one:
#dummy data
set.seed(123)
mydata <- data.frame(gender = sample(1:2, 100, replace = TRUE),
                     agegroup = sample(1:10, 100, replace = TRUE))
#groups
gender <- c(1,2)
agegroup <- c(1,2,3,4,5,6,7,8)
#get all combo
myCombo <- expand.grid(gender, agegroup)
#result is a list object
apply(myCombo, 1, function(i) {
  mydata[mydata$gender == i[1] &
           mydata$agegroup == i[2], ]
})
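If the demographic columns are not known in advance, the same idea can be written without hardcoding any names. This is only a sketch: it assumes every column except the measurement column col3 is a demographic, and the names demo_cols, combos and subsets are made up for illustration.

```r
# Reproducible data equivalent to the question's datafile
set.seed(1)
datafile <- data.frame(gender = rep(c(1, 2), 100),
                       agegroup = rep(1:8, 25),
                       col3 = runif(200))

# Assumption: everything except col3 is a demographic column
demo_cols <- setdiff(names(datafile), "col3")

# One row per combination of the classes observed in each demographic
combos <- expand.grid(lapply(datafile[demo_cols], unique))

# For each combination, keep the rows that match on every demographic column
subsets <- lapply(seq_len(nrow(combos)), function(r) {
  keep <- Reduce(`&`, Map(function(col, val) datafile[[col]] == val,
                          demo_cols, combos[r, ]))
  datafile[keep, ]
})

# Label each subset with its class values, e.g. "1_3" for gender 1, agegroup 3
names(subsets) <- do.call(paste, c(combos, sep = "_"))
```

Note that expand.grid produces every combination, including ones that never occur in the data, so some elements of the list are empty data frames.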
Edit: Based on the update, I think you just need the split command:
split(datafile, datafile[, 1:2])
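To avoid relying on column positions, the grouping columns can be selected by name at run time. A minimal sketch, assuming every column other than col3 is a demographic (demo_cols and subsets are illustrative names, not part of the question):

```r
# Reproducible data equivalent to the question's datafile
set.seed(1)
datafile <- data.frame(gender = rep(c(1, 2), 100),
                       agegroup = rep(1:8, 25),
                       col3 = runif(200))

# Assumption: everything except col3 is a demographic column
demo_cols <- setdiff(names(datafile), "col3")

# Split on all demographic columns at once; drop = TRUE discards
# combinations that do not occur in the data
subsets <- split(datafile, datafile[demo_cols], drop = TRUE)

# Elements are named by their class values joined with ".", so this is
# the gender = 1, agegroup = 3 subset
dat_gender_1_agegroup_3 <- subsets[["1.3"]]
```

The result is a named list, so adding another demographic column to datafile changes nothing in the code; only the element names get longer (for example "1.3.2").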
Upvotes: 1
Reputation: 165
What about (assuming the column names are "gender" and "agegroup"):
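# Give the value vectors names that differ from the column names: inside subset(),
# a bare `gender` refers to the column, so gender %in% gender would always be TRUE.
gender_values <- c(1, 2)
agegroup_values <- c(1, 2, 3, 4, 5, 6, 7, 8)
data_subset <- subset(full_data, gender %in% gender_values | agegroup %in% agegroup_values)  # | [AND SO ON]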
You can add as many [column_name]%in%[values] as you want.
HTH a little!
EDIT: you can very well use & instead of |, obviously.
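When the column names are only known at run time, the conditions can be built from a named list instead of being typed into subset(). This is a sketch under assumptions: allowed is a hypothetical specification mapping column names to the values to keep, and the data mirrors the question's datafile.

```r
# Reproducible data equivalent to the question's datafile
set.seed(1)
datafile <- data.frame(gender = rep(c(1, 2), 100),
                       agegroup = rep(1:8, 25),
                       col3 = runif(200))

# Hypothetical filter specification: column name -> values to keep
allowed <- list(gender = 1, agegroup = c(3, 5))

# One logical vector per column, combined with & (use `|` to match any condition)
keep <- Reduce(`&`, Map(function(col, vals) datafile[[col]] %in% vals,
                        names(allowed), allowed))
data_subset <- datafile[keep, ]
```

Adding another demographic then only means adding another entry to allowed.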
Upvotes: 0