Reputation: 13334
Here is my data frame:
Col1 Col2 Col3 Day
1 12 2 14031100000417017 70
2 10 3 14031100000537996 6
3 7 4 14031100000636479 11
4 12 2 14031100001124050 70
5 12 2 14031100001712518 70
6 10 3 14031100001920801 6
How can I create a new data frame that shows me how many times Col1
, Col2
and Day
occur with the same values? So in this example I would get:
Col1 Col2 Col3 Day Count
1 12 2 14031100000417017 70 3
2 10 3 14031100000537996 6 2
3 7 4 14031100000537996 11 1
Upvotes: 0
Views: 47
Reputation: 6355
Or using plyr:
require(plyr)
ddply(df, .(Col1, Col2, Day), summarize, cnt = length(Col1))
or aggregate
:
aggregate(Col3 ~ Col1 + Col2 + Day, df, length)
Col1 Col2 Day Col3
1 10 3 6 2
2 7 4 11 1
3 12 2 70 3
Upvotes: 0
Reputation: 78590
This is a good case for using the dplyr
package, described in detail here.
library(dplyr)
my.dataframe %.% group_by(Col1, Col2, Day) %.% summarise(Count=n())
This can also be written as
summarise(group_by(my.dataframe, Col1, Col2, Day), Count=n())
Upvotes: 1
Reputation: 27388
And to round it off, here's a data.table
solution:
library(data.table)
mydt <- data.table(mydf)
mydt[, length(Col3), list(Col1, Col2, Day)]
# Col1 Col2 Day V1
# 1: 12 2 70 3
# 2: 10 3 6 2
# 3: 7 4 11 1
The third argument is a list of grouping factors, while the second arg is the function that you'd like to apply to each group.
Upvotes: 0
Reputation: 193517
In base R, you can just use ave
to create a "count" column. You don't really specify how you want to deal with "Col3", so I've left that out:
mydf$count <- ave(rep(1, nrow(mydf)),
mydf[c("Col1", "Col2", "Day")],
FUN = length)
mydf
# Col1 Col2 Col3 Day count
# 1 12 2 1.40311e+16 70 3
# 2 10 3 1.40311e+16 6 2
# 3 7 4 1.40311e+16 11 1
# 4 12 2 1.40311e+16 70 3
# 5 12 2 1.40311e+16 70 3
# 6 10 3 1.40311e+16 6 2
unique(mydf[c("Col1", "Col2", "Day", "count")])
# Col1 Col2 Day count
# 1 12 2 70 3
# 2 10 3 6 2
# 3 7 4 11 1
Upvotes: 0