Conditional Summary in R: MaxSum

Question

I have a data frame of authors in R, part of a much larger data set than the example below, and I'd like to get better descriptive statistics for it. I know (more or less) how to use maxsum, but how could I get the summary of unique authors EXCEPT for, say, the top 2 most frequent authors? How would I then determine the new maxsum? And how would I get the new maxsum value itself (3 in this example) rather than the printed summary?

I'm basically looking for conditional ways of summarizing my data. Can anyone help me out with this?

dat <- data.frame(
  author = c("a", "b", "c", "d", "a", "b", "c", "d", "e",
             "a", "a", "a", "a", "a", "c", "c", "c", "c"),
  Post = rep("one", 18)
)
author_vec <- dat$author
length(unique(author_vec))  # 5
ex_s <- summary(as.factor(author_vec), maxsum = 5)

Tim Biegeleisen · Accepted Answer

Here is an approach using the plyr package:

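library(plyr)
# Count the number of rows (posts) per author
temp <- ddply(dat, ~author, summarise, sum = length(author))
# Sort by count, descending, and drop the top two rows
# (assumes there are at least three distinct authors)
temp <- temp[order(-temp$sum), ][3:nrow(temp), ]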

> temp
  author sum
2      b   2
4      d   2
5      e   1

The authors a and c have been removed because they were the two most frequent authors in the data set (7 and 6 posts respectively).
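If the goal is also to recover the new maxsum as a number, a base R variant might look like the sketch below. It is not part of the plyr approach above; the names counts, top, kept, new_maxsum and new_summary are made up for illustration, and it assumes ties among the most frequent authors can be broken arbitrarily.

```r
# Example data from the question (only the author column is needed here)
dat <- data.frame(
  author = c("a", "b", "c", "d", "a", "b", "c", "d", "e",
             "a", "a", "a", "a", "a", "c", "c", "c", "c")
)

# Frequency of each author, most frequent first
counts <- sort(table(dat$author), decreasing = TRUE)

# Names of the two most frequent authors (head() is safe with fewer than two)
top <- head(names(counts), 2)

# Keep only posts by the remaining authors and rebuild the factor levels
kept <- factor(as.character(dat$author[!(dat$author %in% top)]))

# The new maxsum is the number of authors that remain
new_maxsum <- nlevels(kept)

# Summary of the remaining authors using the new maxsum
new_summary <- summary(kept, maxsum = new_maxsum)
new_summary
```

Here new_maxsum holds the value itself (3 for the example data), and new_summary is a named vector of counts for the remaining authors, so either can be reused in later code instead of being read off the printed output.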
