user2093526

Reputation: 105

Calculate proportions within subsets of a data frame

I am trying to obtain proportions within subsets of a data frame. For example, in this made-up data frame:

DF <- data.frame(category1 = rep(c("A", "B"), each = 9),
                 category2 = rep(rep(LETTERS[24:26], each = 3), 2),
                 animal = rep(c("dog", "cat", "mouse"), 6),
                 number = sample(18))

I would like to calculate the proportion of each of the three animals within each category1 by category2 combination (e.g., out of all animals that are both "A" and "X", what proportion are dogs?). With prop.table on column 4 of the data frame I can get the proportion that each row makes up of the total "number" column, but I have not found a way to do this for subsets based on category1 and category2. I also tried splitting the data by category1 and category2 like this:

splitDF<-split(DF,list(DF$category1,DF$category2))

I was hoping I could then apply a function with prop.table to get the proportions of each animal within each split group, but I cannot get prop.table to work because I can't see how to specify which column the function should be applied to within the split groups. Does anyone have any tips? Maybe this is possible with plyr or something similar? I can't find anything in the help forums about getting proportions within subsets of data.

Upvotes: 7

Views: 29407

Answers (2)

Aditya Sihag

Reputation: 5167

Does this produce your desired output?

DF$proportion <- as.vector(unlist(tapply(DF$number, paste(DF$category1, DF$category2, sep = "."), FUN = function(x) x / sum(x))))
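
One caveat: tapply() returns the groups in sorted key order, so the one-liner above lines up with the rows only when the data frame is already sorted by category1 and category2, as the example data happens to be. A minimal sketch of an order-independent variant, assuming base R's ave() (not part of the original answer), which returns its result aligned with the original rows. The set.seed() call and the row shuffle are added here purely for illustration:

```r
# Recreate the example data; set.seed() is only for reproducibility
set.seed(1)
DF <- data.frame(category1 = rep(c("A", "B"), each = 9),
                 category2 = rep(rep(LETTERS[24:26], each = 3), 2),
                 animal = rep(c("dog", "cat", "mouse"), 6),
                 number = sample(18))

# Shuffle the rows to show that the result does not depend on row order
DF <- DF[sample(nrow(DF)), ]

# ave() applies FUN within each category1/category2 group and
# returns the values in the same order as the input rows
DF$proportion <- ave(DF$number, DF$category1, DF$category2,
                     FUN = function(x) x / sum(x))

# Sanity check: the proportions within each group should sum to 1
tapply(DF$proportion, list(DF$category1, DF$category2), sum)
```

The final tapply() call is just a check and can be dropped once the column looks right.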

Upvotes: 3

Didzis Elferts

Reputation: 98529

You can use the ddply() function from the plyr package to calculate the proportions within each combination and add them to the data frame as a new column.

library(plyr)
DF <- ddply(DF, .(category1, category2), transform, prop = number / sum(number))
DF
   category1 category2 animal number       prop
1          A         X    dog     17 0.44736842
2          A         X    cat      3 0.07894737
3          A         X  mouse     18 0.47368421
4          A         Y    dog      2 0.14285714
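
If plyr is not available, the split() attempt from the question can also be completed in base R. The following is only a sketch under the assumption that the data looks like the example DF (recreated here with a seed so the run is reproducible; DF2 and groups are names chosen for illustration): prop.table() is applied to the number column inside each piece, and unsplit() puts the pieces back in the original row order.

```r
# Recreate the example data; set.seed() is only for reproducibility
set.seed(1)
DF <- data.frame(category1 = rep(c("A", "B"), each = 9),
                 category2 = rep(rep(LETTERS[24:26], each = 3), 2),
                 animal = rep(c("dog", "cat", "mouse"), 6),
                 number = sample(18))

# Split into one data frame per category1/category2 combination
groups <- list(DF$category1, DF$category2)
splitDF <- split(DF, groups)

# Inside each piece, prop.table() on the number column gives
# each row's share of that piece's total
splitDF <- lapply(splitDF, function(g) {
  g$prop <- prop.table(g$number)
  g
})

# unsplit() reassembles the pieces in the original row order
DF2 <- unsplit(splitDF, groups)
DF2
```

The key step is naming the column inside the function passed to lapply(), which is what the split-based attempt in the question was missing.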

Upvotes: 6
