Reputation: 141
I am trying to run the anesrake function (part of the anesrake package, https://cran.r-project.org/web/packages/anesrake/index.html, which rakes sample weights so that the sample matches population targets) in R to compute weights across several variables.
I have a table of sample data testData:
Index GENDER AGE
1 Female 18-24
2 Female 35-64
3 Male 65+
Note: AGE has 6 levels: 18-24, 25-34, 35-44, 45-54, 55-64, 65+
I then have two numeric vectors holding my population proportions:
GENDER <- c(.49,.51)
AGE <- c(.08,.1,.12,.2,.2,.3)
I then create a set of target variables and a CASEID column on the original table:
targets <- list(GENDER, AGE)
names(targets) <- c("GENDER", "AGE")
testData$CASEID <- 1:length(testData$GENDER)
I can then see how far my sample data deviates from my population targets:
> anesrakefinder(targets, testData, choosemethod = "total")
GENDER AGE
0.1495337 0.3668394
But when I call the anesrake function to do the final weighting, I get an error:
> anesrake(inputter=targets,dataframe=testData,caseid=testData$CASEID)
Error in rakeonvar.default(mat[, i], inputter[[i]], weightvec) :
number of variable levels does not match number of weighting levels
In addition: Warning message:
In rakeonvar.default(mat[, i], inputter[[i]], weightvec) :
NAs introduced by coercion
I've been following these two tutorials on how to use anesrake, but I'm still coming up short:
http://sdaza.com/survey/2012/08/25/raking/
http://surveyinsights.org/wp-content/uploads/2014/07/Full-anesrake-paper.pdf
Any help that you could provide on this would be greatly, greatly appreciated.
Cheers,
Stu
Upvotes: 2
Views: 3176
Reputation: 31
You need to name the elements of each target vector with the levels of the matching variable in your data. For example, with my own data frame rak2 and its variables agecat1 and newpayer:
names(targets$agecat1) <- levels(rak2$agecat1)
names(targets$newpayer) <- levels(rak2$newpayer)
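Applied to the names used in the question, the same idea might look like the sketch below. The testData here is a small stand-in with illustrative values, and it assumes the proportions in each target vector are listed in the same order as levels() returns them (alphabetical for GENDER, so Female then Male).

```r
# Small stand-in for the question's testData (values are illustrative only).
age_levels <- c("18-24", "25-34", "35-44", "45-54", "55-64", "65+")
testData <- data.frame(
  GENDER = factor(c("Female", "Female", "Male")),
  AGE    = factor(c("18-24", "35-44", "65+"), levels = age_levels)
)

# Population proportions from the question.
targets <- list(
  GENDER = c(.49, .51),
  AGE    = c(.08, .1, .12, .2, .2, .3)
)

# Label each target proportion with the matching factor level.
# Assumption: proportions are given in the same order as levels().
names(targets$GENDER) <- levels(testData$GENDER)
names(targets$AGE)    <- levels(testData$AGE)

print(targets)
```

Printing targets afterwards is a quick way to check that each proportion ended up under the category you intended.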
Upvotes: 3
Reputation: 33
I just solved the same issue by transforming my data from character to factor.
You can try the following:
testData$GENDER <- as.factor(testData$GENDER)
testData$AGE <- as.factor(testData$AGE)
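One thing to watch for with as.factor(): it only creates levels for values that actually occur in the column, so a sample that is missing some age groups would end up with fewer levels than the six target proportions. A minimal sketch of a workaround, assuming the six age labels from the question and illustrative data values, is to pass the levels explicitly to factor():

```r
# The six age categories and their target proportions from the question.
age_levels <- c("18-24", "25-34", "35-44", "45-54", "55-64", "65+")
AGE <- c(.08, .1, .12, .2, .2, .3)

# Character column holding only some of the six categories (illustrative values).
age_chr <- c("18-24", "35-44", "65+")

# as.factor() keeps only the categories that occur in the data.
nlevels(as.factor(age_chr))                      # 3

# factor() with explicit levels keeps all six, matching the targets.
age_fct <- factor(age_chr, levels = age_levels)
nlevels(age_fct) == length(AGE)                  # TRUE
```

Any value that is not in age_levels becomes NA under this approach, so it is worth checking anyNA(age_fct) afterwards.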
Upvotes: 1