Reputation: 530
I have a data frame of authors in R (from a much larger data set than the example below) that I'd like to get better descriptives of. I know (kind of) how to use maxsum, but how could I get the summary of unique authors EXCEPT for the top 2 most frequent authors, for example? How would I then determine the new maxsum? And how would I get the actual value of the new maxsum (3 here) rather than just the printed summary?
I'm basically looking for conditional ways of summarizing my data. Can anyone help me out in this department?
dat <- data.frame(author=c("a", "b", "c", "d", "a", "b", "c", "d", "e", "a", "a", "a","a", "a", "c","c","c","c"),Post=c("one", "one", "one", "one", "one", "one", "one", "one", "one", "one","one", "one","one", "one","one", "one","one", "one"))
authors <-dat[,1]
author_vec <- (authors)
length(unique(author_vec)) #5
ex_s <- summary(as.factor(author_vec), maxsum=5)
Upvotes: 2
Views: 539
Reputation: 521639
Here is an approach using the plyr library:
require(plyr)
temp <- ddply(dat, ~author, summarise, sum=length(author))
temp <- temp[order(-temp$sum), ][3:nrow(temp), ]
> temp
author sum
2 b 2
4 d 2
5 e 1
The authors a and c have been removed because they were the two most frequently appearing authors in the data set.
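For comparison, roughly the same result can be sketched in base R without plyr. This assumes the dat data frame from the question; the names counts and rest are only illustrative and not part of the original answer.

```r
# Example data from the question
dat <- data.frame(
  author = c("a", "b", "c", "d", "a", "b", "c", "d", "e",
             "a", "a", "a", "a", "a", "c", "c", "c", "c"),
  Post = rep("one", 18)
)

# Count posts per author, most frequent first
counts <- sort(table(dat$author), decreasing = TRUE)

# Drop the two most frequent authors (illustrative name: rest)
rest <- counts[-(1:2)]
rest
```

Here table() does the counting that ddply() does above, and negative indexing drops the first two entries after sorting.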
Upvotes: 1
Reputation: 263381
It wasn't clear how many you expected after excluding the top 2. This assumes you wanted the next three in frequency (since you said you understood how maxsum works). If you wanted the next five, add two to your current maxsum:
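# as.factor() is needed when author_vec is a character vector (the data.frame default since R 4.0)
ex_s <- sort(summary(as.factor(author_vec), maxsum=5), decreasing=TRUE)[-(1:2)]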
ex_s
#------
b d e
2 2 1
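To get the new maxsum as a number rather than as printed output, one option is to take the length of the trimmed summary. The sketch below assumes the author vector from the question; new_maxsum is an illustrative name, and maxsum is set to the number of unique authors so that no "(Other)" bucket is created.

```r
# Author vector from the question
author_vec <- c("a", "b", "c", "d", "a", "b", "c", "d", "e",
                "a", "a", "a", "a", "a", "c", "c", "c", "c")

# Full per-author counts; maxsum covers every level, so nothing is lumped into "(Other)"
counts <- summary(as.factor(author_vec), maxsum = length(unique(author_vec)))

# Drop the two most frequent authors
ex_s <- sort(counts, decreasing = TRUE)[-(1:2)]

# Number of authors left, i.e. the new maxsum (illustrative name)
new_maxsum <- length(ex_s)
new_maxsum
```

For this data new_maxsum is 3, the same as length(unique(author_vec)) - 2.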
Upvotes: 0