How can I aggregate data with categorical responses to get the percentage of each response type in R?

Question

I want to compute the percentage of each categorical response level within each question type (TYPE). Each individual has several responses per type, and each response takes one of several levels. In the result:

1) each individual should be on a separate row, and
2) the columns should be TYPE + response level combinations, with each value being the proportion of times that individual gave that response level for that question type.

The DATA looks like this:

SUBJECT TYPE    RESPONSE
John    a       kappa
John    b       gamma
John    a       delta
John    a       gamma
Mary    a       kappa
Mary    a       delta
Mary    b       kappa
Mary    a       gamma
Bill    b       delta
Bill    a       gamma

The result should look like this:

SUBJECT a-kappa   a-gamma   a-delta   b-kappa   b-gamma   b-delta
John    0.33      0.33      0.33      0.00      1.00      0.00
Mary    0.33      0.33      0.33      1.00      0.00      0.00
Bill    0.00      1.00      0.00      0.00      0.00      1.00
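
To make the target layout concrete, here is a minimal base R sketch that rebuilds the sample data and computes the share of each response level within each subject and type. It is only one possible approach, not a definitive solution: the object names `counts`, `props` and `wide` are made up for illustration, and it assumes the columns are named SUBJECT, TYPE and RESPONSE exactly as in the sample.

```r
# Sample data copied from the question.
mydata <- data.frame(
  SUBJECT  = c("John", "John", "John", "John", "Mary",
               "Mary", "Mary", "Mary", "Bill", "Bill"),
  TYPE     = c("a", "b", "a", "a", "a", "a", "b", "a", "b", "a"),
  RESPONSE = c("kappa", "gamma", "delta", "gamma", "kappa",
               "delta", "kappa", "gamma", "delta", "gamma")
)

# Count every SUBJECT x TYPE x RESPONSE combination
# (combinations that never occur get a count of 0).
counts <- xtabs(~ SUBJECT + TYPE + RESPONSE, data = mydata)

# Divide each count by the total for its SUBJECT x TYPE cell.
props <- prop.table(counts, margin = c(1, 2))

# A subject with no answers at all for a type would give 0/0 = NaN;
# treating that as 0 is an assumption, not something the question specifies.
props[is.nan(props)] <- 0

# Flatten to one row per subject and one column per TYPE/RESPONSE pair.
wide <- ftable(round(props, 2), row.vars = "SUBJECT")
print(wide)
```

Note that `xtabs` orders subjects and levels alphabetically, so the rows come out as Bill, John, Mary rather than in order of appearance. If a plain long data frame is preferred, `as.data.frame(props, responseName = "PROP")` should give one.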

Based on c1au61o_HH's answer, I was able to create something that works for my actual data file, although the result still needs some post-processing. (It is also not very elegant, but that's a minor concern.)

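    library(dplyr)

    # TOT: answers per subject and type; RESPTOT: answers per subject, type and response level
    Finaldf <- mydata %>%
      group_by(Subject, Type) %>%
      mutate(TOT = n()) %>%
      group_by(Subject, Response, Type) %>%
      mutate(RESPTOT = n()) %>%
      ungroup()

    # Drop duplicate rows, then compute each level's share within its type
    Finaldf <- distinct(Finaldf)
    Finaldf$Percentage <- Finaldf$RESPTOT / Finaldf$TOT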
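
For reference, the remaining post-processing could look roughly like the sketch below, which spreads the long table into one column per type and response level with `tidyr::pivot_wider`. This is a hedged illustration rather than a tested solution for the real file: it assumes the columns are named Subject, Type and Response as in the snippet above, it recreates the sample data under those names, and `wide` is a made-up object name.

```r
library(dplyr)
library(tidyr)

# Sample data from the question, with the column names used in the snippet above.
mydata <- data.frame(
  Subject  = c("John", "John", "John", "John", "Mary",
               "Mary", "Mary", "Mary", "Bill", "Bill"),
  Type     = c("a", "b", "a", "a", "a", "a", "b", "a", "b", "a"),
  Response = c("kappa", "gamma", "delta", "gamma", "kappa",
               "delta", "kappa", "gamma", "delta", "gamma")
)

# Same counting logic as above: totals per subject/type and per subject/type/response.
Finaldf <- mydata %>%
  group_by(Subject, Type) %>%
  mutate(TOT = n()) %>%
  group_by(Subject, Response, Type) %>%
  mutate(RESPTOT = n()) %>%
  ungroup() %>%
  mutate(Percentage = RESPTOT / TOT)

# Keep only the identifying columns and the share, drop repeated rows,
# then create one column per Type/Response pair (for example "a-kappa").
wide <- Finaldf %>%
  select(Subject, Type, Response, Percentage) %>%
  distinct() %>%
  pivot_wider(
    names_from  = c(Type, Response),
    values_from = Percentage,
    names_sep   = "-",
    values_fill = 0   # combinations a subject never gave become 0
  )

print(wide)
```

Because the new column names contain a hyphen, they need backticks when accessed directly, for example ``wide$`a-kappa` ``. Selecting only the needed columns before `distinct()` matters if the real data has extra columns that differ between otherwise identical rows.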

Any help is much appreciated, ideally with some explanation.
