Reputation: 457
How do you properly standardize a variable that was collected with quota sampling? Let me explain.
I am working with survey data that was collected with quota sampling. Each quota is a different village (1500 in total). The questionnaire was administered to 10% of each village's population. The villages vary a lot in size: from tens of thousands of inhabitants to only a few hundred.
I am working with logit models and want to standardize one of my data frame's columns. Should I standardize it as is? Or would the population imbalance between the villages bias my standardized variable? Should I include population weights?
To illustrate with data, let's imagine there are only two villages (village 1 is big and village 2 is small). This is what the data would look like:
total1 <- data.frame(response1 = c(0.4, -0.1, 2.1, 0.08, 0, -2.5),
                     village.number = c(1, 1, 1, 1, 2, 2))
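To make the two options concrete, here is a rough sketch of what I mean by "as is" versus "weighted". The weights are purely an illustrative assumption on my part (one over the village's sample size, so that each village counts equally in total), and the column names `z.plain`, `w`, and `z.weighted` are made up for this example. I do not know whether these are the right weights, which is part of the question.

```r
# Toy data from above (village 1 is big, village 2 is small).
total1 <- data.frame(response1 = c(0.4, -0.1, 2.1, 0.08, 0, -2.5),
                     village.number = c(1, 1, 1, 1, 2, 2))

# Option 1: standardize as is, so every observation counts equally.
# scale() centers on the sample mean and divides by the sample SD.
total1$z.plain <- as.numeric(scale(total1$response1))

# Option 2: weighted standardization.
# ASSUMPTION: weight = 1 / (number of sampled observations in the village),
# which gives each village the same total weight. Other weights are possible.
n.village <- ave(total1$response1, total1$village.number, FUN = length)
total1$w <- 1 / n.village

# Weighted mean and weighted (population-style, no n - 1 correction) SD.
w.mean <- weighted.mean(total1$response1, total1$w)
w.sd <- sqrt(sum(total1$w * (total1$response1 - w.mean)^2) / sum(total1$w))
total1$z.weighted <- (total1$response1 - w.mean) / w.sd

print(total1)
```

With the first option the four observations from village 1 dominate the mean and standard deviation; with the second, the two villages pull equally.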
The question stands: how do I standardize response1 when village 1 has twice as many observations as village 2? Thank you.
Upvotes: 0
Views: 79