Reputation: 449
I'm running a frequency table of several variables in my data like so:
frequencies1 <- data %>%
gather(variable, value,
Age,
Gender,
Comparison,
Tenure
) %>%
group_by(variable, value) %>%
summarise (n = n()) %>%
mutate(freq = ((n / sum(n, na.rm = TRUE)) * 100))
The output:
variable value n freq
Age 18-24 9 0.692
Age 25-29 2 0.153
Age 30-34 2 0.153 <- age % = 100% of 13
Comparison 4 0.307
Comparison Better 2 0.153
Comparison Same 4 0.307
Comparison Worse 3 0.230 <- comparison % = 100% of 13
WhereSeen_Facebook 1 2 0.153
WhereSeen_Instagram 1 5 0.384
WhereSeen_TV 1 9 0.692
Tenure 1 0.076
Tenure 1 year or less 12 0.923
Each variable does not have the same amount of responses (i.e. Age n=13, Comparison n=9, etc...) which ends up showing as blanks under value. Is there a way to run the frequencies and filter out empty rows per variable? The proportions are based on the total sample when they should be only for the non-empty values.
Desired output:
variable value n freq
Age 18-24 9 0.692
Age 25-29 2 0.153
Age 30-34 2 0.153 <- age % = 100% of n = 13
Comparison Better 2 0.222
Comparison Same 4 0.444
Comparison Worse 3 0.333 <- comparison % = 100% of n = 9
WhereSeen_Facebook 1 2
WhereSeen_Instagram 1 5
WhereSeen_TV 1 9
Tenure 1 year or less 12
Upvotes: 0
Views: 205
Reputation: 170
I think you can add in a filter(value != "")
(assuming value is a string and the empty value is just "") before you group_by
.
So you'd have:
frequencies1 <- data %>%
gather(variable, value,
Age,
Gender,
Comparison,
Tenure
) %>%
filter(value != "") %<%
group_by(variable, value) %>%
summarise (n = n()) %>%
mutate(freq = ((n / sum(n, na.rm = TRUE)) * 100))
Upvotes: 1