I have a data set named have with the following structure:
have <- tibble::tibble(
  id = 1:30,
  state = rep(c("A", "B", "C"), each = 10),
  score = c(147, 735, 519, 458, 599, 628, 988, 787, 298, 612,
            319, 715, 248, 637, 239, 254, 601, 702, 902, 867,
            343, 535, 730, 518, 277, 612, 869, 865, 227, 641),
  weight = c(3.13, 1.46, 2.57, 4.39, 1.32, 3.81, 1.29, 1.58, 2.74, 4.13,
             1.43, 1.29, 1.81, 3.87, 3.10, 1.18, 4.15, 4.35, 3.35, 3.59,
             4.69, 3.38, 3.51, 3.35, 2.60, 1.99, 2.34, 4.60, 3.77, 1.31))
I would like to add a column with weighted terciles (wterciles) and another column with weighted quartiles (wquartiles). Both should group score within each state, incorporating the sample weights in weight.
The weight variable is a frequency expansion weight for computing point estimates. It has to be taken into account because not all students who should have taken the test actually did, so the institution responsible for the test calculated a weight for each student based on the attendance rate of their school and socioeconomic group. A weight of 2 therefore means that the student should be counted as if they were 2 students when computing the state average.
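To make the "counted as 2 students" reading concrete, here is a minimal sketch in base R. The numbers are made up for illustration and use integer weights so the data can be literally expanded; the real weights in have are fractional, so only the weighted form applies to them.

```r
# Made-up scores and integer frequency weights (illustration only)
score  <- c(100, 200)
weight <- c(2, 1)

# Weighted mean: the first student counts as two students
weighted.mean(score, weight)

# Equivalent computation: repeat each score `weight` times, then average
mean(rep(score, times = weight))
```

Both calls give the same value, which is the sense in which the weight acts as a frequency.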
I prefer {dplyr} syntax, but couldn't find a way to use weights with it. The dplyr::ntile() function does not handle weights.
Without weights it would be something like:
library(dplyr)

have %>%
  group_by(state) %>%
  mutate(
    wtercile = ntile(score, 3),
    wquartile = ntile(score, 4)) %>%
  ungroup()
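For comparison, here is a rough sketch of what a weighted version might look like. The helper weighted_ntile() is hypothetical (it is not part of dplyr), and it rests on one assumed definition: an observation belongs to the group in which its cumulative share of the total weight falls, after sorting by score. It also assumes strictly positive weights and no missing values. Other definitions of weighted quantile groups exist and can assign boundary cases differently, and with equal weights the result is not guaranteed to match ntile() exactly. The toy data frame is made up for illustration.

```r
library(dplyr)

# Hypothetical helper, not a dplyr function: split x into n groups so that
# each group holds roughly 1/n of the total weight w.
# Assumes w > 0 and no NA values in x or w.
weighted_ntile <- function(x, w, n) {
  ord <- order(x)
  # Cumulative share of total weight, in ascending order of x
  cum_share <- cumsum(w[ord]) / sum(w)
  grp <- integer(length(x))
  # Map the share in (0, 1] to a group 1..n; pmin guards against rounding
  grp[ord] <- pmin(ceiling(cum_share * n), n)
  as.integer(grp)
}

# Made-up toy data: the last student carries half of the total weight
toy <- tibble::tibble(
  state  = c("A", "A", "A", "A"),
  score  = c(10, 20, 30, 40),
  weight = c(1, 1, 1, 3))

toy %>%
  group_by(state) %>%
  mutate(whalf = weighted_ntile(score, weight, 2)) %>%
  ungroup()
```

Under the same assumptions, the helper could be dropped into the pipeline above in place of ntile(), for example weighted_ntile(score, weight, 3) for terciles and weighted_ntile(score, weight, 4) for quartiles.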