Reputation: 6470
In R (or S-PLUS), what is a good way to aggregate String data in a data frame?
Consider the following:
myList <- as.data.frame(c("Bob", "Mary", "Bob", "Bob", "Joe"))
I would like the output to be:
[Bob, 3
Mary, 1
Joe, 1]
Currently, the only way I know how to do this is with the summary function.
> summary(as.data.frame(myList))
Bob :3
Joe :1
Mary:1
This feels like a hack. Can anyone suggest a better way?
Upvotes: 1
Views: 1165
Reputation: 1101
Using data.table
myList <- data.frame(v1=c("Bob", "Mary", "Bob", "Bob", "Joe"))
library(data.table)
setDT(myList)[, .N, by = v1]  # .N is the number of rows in each v1 group
v1 N
1: Bob 3
2: Mary 1
3: Joe 1
Upvotes: 1
Reputation: 18838
Using the sqldf library:
require(sqldf)
myList<- data.frame(v=c("Bob", "Mary", "Bob", "Bob", "Joe"))
sqldf("SELECT v,count(1) FROM myList GROUP BY v")
Upvotes: 0
Reputation: 44128
Do you mean like this?
myList <- c("Bob", "Mary", "Bob", "Bob", "Joe")
r <- rle(sort(myList))
# Build the data frame directly: cbind() would coerce the counts to character
result <- data.frame(Name = r$values, Occurrences = r$lengths)
result
Name Occurrences
1 Bob 3
2 Joe 1
3 Mary 1
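The sort() step matters because rle() only collapses adjacent repeats. A minimal sketch of the difference, using the same vector (the names unsorted and sorted are just illustrative):

```r
myList <- c("Bob", "Mary", "Bob", "Bob", "Joe")

# Without sorting, "Bob" appears as two separate runs
unsorted <- rle(myList)
unsorted$values   # one entry per run, so "Bob" is listed twice
unsorted$lengths  # length of each run

# After sorting, equal names are adjacent, so each name forms a single run
sorted <- rle(sort(myList))
sorted$values     # one entry per distinct name
sorted$lengths    # number of occurrences of each name
```

So the run lengths only equal the occurrence counts once the input is sorted.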
Upvotes: 1
Reputation: 3035
This is a combination of the above answers (as suggested by Thierry)
data.frame(table(myList[,1]))
which gives you
Var1 Freq
1 Bob 3
2 Joe 1
3 Mary 1
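If the default Var1 and Freq column names are not wanted, they can probably be set in the same call. A small sketch, assuming myList is a one-column data frame as in the question; the column names Name and Occurrences are just illustrative choices:

```r
myList <- data.frame(v1 = c("Bob", "Mary", "Bob", "Bob", "Joe"))

# Naming the argument to table() names the category column;
# responseName names the count column.
counts <- as.data.frame(table(Name = myList[, 1]),
                        responseName = "Occurrences",
                        stringsAsFactors = FALSE)
counts
```

With stringsAsFactors = FALSE the Name column comes back as plain character strings rather than a factor.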
Upvotes: 2
Reputation: 2289
Using table, no need to sort:
ctable <- table(myList)
counts <- data.frame(Name = names(ctable), Count = as.vector(ctable))
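Note that table() lists the names in sorted order (Bob, Joe, Mary), while the question asks for Bob, Mary, Joe. One possible way to keep the order of first appearance is to fix the factor levels before counting. This is only a sketch, and names_chr is a helper name introduced here for illustration:

```r
myList <- data.frame(v1 = c("Bob", "Mary", "Bob", "Bob", "Joe"))

# Work on plain strings so the result does not depend on factor defaults
names_chr <- as.character(myList[, 1])

# Levels in order of first appearance, so table() keeps that order
ctable <- table(factor(names_chr, levels = unique(names_chr)))

counts <- data.frame(Name = names(ctable), Count = as.vector(ctable),
                     stringsAsFactors = FALSE)
counts
```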
Upvotes: 2