Reputation: 595
Suppose there are 8 samples in total and the data frame looks like the one below. All individuals with a health score less than 3 are Healthy, and all with a health score greater than 3 are Sick. Status shows their employment status.
Status <- c("Employed", "Unemployed", "Student", "Student", "Employed", "Unemployed", "Unemployed", "Housewife")
Health <- c("Healthy", "Healthy", "Healthy", "Sick", "Sick", "Control", "Sick", "Sick")
df <- data.frame(Status, Health)
df$Health <- factor(df$Health, levels = c("Healthy", "Sick", "Control"))
df$Status <- factor(df$Status, levels = c("Employed", "Unemployed", "Student", "Housewife"))
I want to see the percentage of Healthy, Sick, or Control people within each occupation category. For example: out of all employed individuals, what percentage are healthy? I want output like the following (the values are hypothetical).
            Healthy  Sick  Control
Employed    10%      2%    1%
Unemployed  5%       1%    1%
Student     6%       3%    1%
Housewife   2%       5%    6%
I am using the following code, but it only gives me frequencies, not percentages. I need percentages.
tab <- with(df, table(Health, Status))
Upvotes: 0
Views: 1585
Reputation: 388817
We can count the number of individuals for each Status and Health, group_by Status, and calculate the percentage. For better readability, we cast the data into wide format.
library(dplyr)

df %>%
  count(Status, Health) %>%
  group_by(Status) %>%
  mutate(n = n / sum(n) * 100) %>%
  tidyr::pivot_wider(names_from = Health, values_from = n,
                     values_fill = list(n = 0))
#  Status     Healthy  Sick Control
#  <fct>        <dbl> <dbl>   <dbl>
#1 Employed      50    50       0
#2 Housewife      0   100       0
#3 Student       50    50       0
#4 Unemployed    33.3  33.3    33.3
In base R, we can use prop.table along with table to get the percentages.
prop.table(table(df), 1) * 100
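As a minimal sketch of how the margin argument interacts with the table's orientation: the snippet below rebuilds the example data so it runs on its own, then computes the row percentages two ways. The names pct and pct2 are only illustrative, and rounding to one decimal place is an assumption about the desired display, not something the question requires.

```r
# Rebuild the example data so the snippet is self-contained.
df <- data.frame(
  Status = c("Employed", "Unemployed", "Student", "Student",
             "Employed", "Unemployed", "Unemployed", "Housewife"),
  Health = c("Healthy", "Healthy", "Healthy", "Sick",
             "Sick", "Control", "Sick", "Sick"),
  stringsAsFactors = TRUE
)

# Status in rows, Health in columns: margin = 1 normalises within each row,
# so every Status row sums to (about) 100 after rounding.
pct <- round(prop.table(table(df$Status, df$Health), margin = 1) * 100, 1)
pct

# With Health in rows and Status in columns (the orientation used in the
# question), Status is the second dimension, so normalise over margin = 2
# and transpose to get the same layout as above.
tab <- table(df$Health, df$Status)
pct2 <- t(round(prop.table(tab, margin = 2) * 100, 1))
pct2
```

Both results answer "out of all individuals with a given Status, what percentage fall in each Health group"; only the orientation of the underlying table differs.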
data
df <- structure(list(Status = structure(c(1L, 4L, 3L, 3L, 1L, 4L, 4L,
2L), .Label = c("Employed", "Housewife", "Student", "Unemployed"
), class = "factor"), Health = structure(c(2L, 2L, 2L, 3L, 3L,
1L, 3L, 3L), .Label = c("Control", "Healthy", "Sick"),
class = "factor")), class = "data.frame",row.names = c(NA, -8L))
Upvotes: 1