Rene Bern

Reputation: 565

summarise() - calculating percentages and counts of factor

I'm trying to use summarise() from the plyr package to calculate the percentage of occurrences of each level in a factor. EDIT: The Puromycin data is in the base R installation.

My data look like this:

library(plyr)
data.p <- as.data.frame(Puromycin[,3])
names(data.p) <- "Treat.group" 

I've done this:

    summarise(data.p,
              "Frequencies" = count(data.p),
              "Percent" = count(data.p) / sum(count(data.p)[2]))

And got this:

  Frequencies.Treat.group Frequencies.freq Percent.Treat.group Percent.freq
1                 treated               12                  NA    0.5217391
2               untreated               11                  NA    0.4782609 

But I don't want the third column to be generated. It is unnecessary and only shows NA.

How do I write the code so I don't get that NA column?

Any pointers are appreciated :)

Upvotes: 1

Views: 13493

Answers (1)

csgillespie

Reputation: 60492

The NA column comes from this expression:

count(data.p)/ sum(count(data.p)[2] )

If you look at the numerator, we get:

R> count(data.p)
  Treat.group freq
1     treated   12
2   untreated   11

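So the NA column (and the accompanying warning) appears because you are dividing the whole data frame by a number. That includes the first column, which is a factor, so R evaluates something like treated/23, which is not meaningful and gives NA. To avoid this, select only the second column of count(data.p):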

summarise(data.p, 
             "Frequencies"= count(data.p), 
             "Percent" = count(data.p)[,2]/ sum(count(data.p)[2]))

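As a variation on the same idea, here is a minimal sketch that calls count() only once and then adds the proportion as a new column, so the factor column never takes part in the division. The names `counts` and `Percent` are my own choices for illustration, and I'm assuming the third column of Puromycin is the `state` factor:

```r
library(plyr)

# Same data as in the question; state is the third column of Puromycin
data.p <- data.frame(Treat.group = Puromycin$state)

# Count each level once; the result has the columns Treat.group and freq
counts <- count(data.p, "Treat.group")

# Divide only the numeric freq column, so the factor column is left alone
counts$Percent <- counts$freq / sum(counts$freq)

counts
```

If you only need the proportions and not the counts, base R's prop.table(table(data.p$Treat.group)) should give the same numbers without plyr.
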
Upvotes: 4
