Reputation: 565
I'm trying to use summarise() from the plyr package to calculate the percentage of occurrences of each level in a factor. EDIT: The Puromycin data is in the base R installation.
My data look like this:
library(plyr)
data.p <- as.data.frame(Puromycin[,3])
names(data.p) <- "Treat.group"
I've done this:
summarise( data.p, "Frequencies"= count(data.p),
"Percent" = count(data.p)/ sum(count(data.p)[2] ))
And got this:
Frequencies.Treat.group Frequencies.freq Percent.Treat.group Percent.freq
1 treated 12 NA 0.5217391
2 untreated 11 NA 0.4782609
But I don't want the third column to be generated. It is unnecessary and only shows NA.
How do I write the code so I don't get that NA column?
Any pointers are appreciated :)
Upvotes: 1
Reputation: 60492
Your error was coming from:
count(data.p)/ sum(count(data.p)[2] )
If you look at the numerator, we get:
R> count(data.p)
Treat.group freq
1 treated 12
2 untreated 11
So the warning occurred because you were dividing the whole data frame by a number, including the first column, i.e. treated/23. Treat.group is a factor, so that division gives NA. To avoid this, just select the second column of count(data.p):
summarise(data.p,
"Frequencies"= count(data.p),
"Percent" = count(data.p)[,2]/ sum(count(data.p)[2]))
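If you'd rather not call count() three times, here is a minimal sketch of an alternative. It assumes you are happy to build the table in two steps rather than inside summarise(); the name freq.tab is just a placeholder I made up, and Puromycin$state is the same column as Puromycin[,3].

```r
library(plyr)

# Same data as in the question; 'state' is the third column of Puromycin.
data.p <- data.frame(Treat.group = Puromycin$state)

# Count each level once. count() returns a data frame with the
# factor column plus a 'freq' column.
freq.tab <- count(data.p, "Treat.group")

# Divide only the numeric 'freq' column, so the factor column is
# never part of the arithmetic and no NA column appears.
freq.tab$Percent <- freq.tab$freq / sum(freq.tab$freq)

freq.tab
```

This should give one row per level with three columns (Treat.group, freq, Percent). If you only need the proportions, prop.table(table(data.p$Treat.group)) from base R would also do it.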
Upvotes: 4