Reputation: 3941
Given data that looks like this:
Year<-c(1,1,1,1,2,2,2,2,3,3,3,3)
Tax<-c('A','B','C','D','A','B','C','D','A','B','C','D')
Count<-c(1,2,1,2,1,2,1,1,1,2,1,1)
Dummy<-data.frame(Year,Tax,Count)
Dummy
Year Tax Count
1 1 A 1
2 1 B 2
3 1 C 1
4 1 D 2
5 2 A 1
6 2 B 2
7 2 C 1
8 2 D 1
9 3 A 1
10 3 B 2
11 3 C 1
12 3 D 1
How would I go about combining some of the "Tax" elements? For instance, suppose I wanted to combine A, B, and C into a new category "ABC". My end result should look like this:
Year Tax Count
1 ABC 4
1 D 2
2 ABC 4
2 D 1
3 ABC 4
3 D 1
Upvotes: 0
Views: 127
Reputation: 57686
Another plyr solution. Just redefine your Tax variable and do a normal summary.
library(plyr)
# Recode Tax first, then sum Count within each Year/Tax combination.
ddply(within(Dummy, {
Tax <- ifelse(Tax %in% c('A','B','C'), 'ABC', 'D')
}), .(Year, Tax), summarise, Count=sum(Count))
If you don't have plyr (or don't like it (!)), this problem is simple enough to handle in base R:
aggregate(Count ~ Year + Tax, within(Dummy, {
Tax <- ifelse(Tax %in% c('A','B','C'), 'ABC', 'D')
}), sum)
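Both snippets above label every row outside A, B, C as 'D', which is fine for this data. As a minimal sketch of the same base R idea that keeps any other Tax value under its own label, something like the following should work. The names to_merge, recoded and result are illustrative and not part of the answer above.

```r
# Rebuild the example data from the question.
Dummy <- data.frame(
  Year  = rep(1:3, each = 4),
  Tax   = rep(c("A", "B", "C", "D"), times = 3),
  Count = c(1, 2, 1, 2, 1, 2, 1, 1, 1, 2, 1, 1),
  stringsAsFactors = FALSE
)

# Values to merge (illustrative name); every other Tax value keeps its label.
to_merge <- c("A", "B", "C")
recoded <- transform(
  Dummy,
  Tax = ifelse(Tax %in% to_merge,
               paste(to_merge, collapse = ""),
               as.character(Tax))
)

# Sum Count within each Year/Tax combination, then sort for readability.
result <- aggregate(Count ~ Year + Tax, data = recoded, FUN = sum)
result <- result[order(result$Year, result$Tax), ]
```

Sorting at the end only matters for display; aggregate already returns one row per Year/Tax pair.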
Upvotes: 3
Reputation: 60060
Alright, here is a much better solution than my original one. No empty dataframes, no rbinding, but it can still deal with arbitrary groups:
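library(plyr)
# Each list element is one group of Tax values to combine.
groups_list = list(c("A", "B", "C"), "D")
# For each row, find the index of the group that contains its Tax value.
Dummy$TaxGroup = sapply(Dummy$Tax, function(tax_value) {
  group_search = sapply(groups_list, function(group) tax_value %in% group)
  which(group_search)
})
# Sum Count per Year and group; name each group by pasting its members together.
combined = ddply(
  Dummy,
  .(Year, TaxGroup),
  summarize,
  GroupName=paste(groups_list[[TaxGroup[1]]], collapse=""),
  CombinedCount=sum(Count)
)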
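For comparison, arbitrary groups can also be expressed as a named lookup vector in base R. This is only a sketch of a different technique, not the method above; the name tax_to_group and the mapping it holds are assumptions made for illustration.

```r
# Rebuild the example data from the question.
Dummy <- data.frame(
  Year  = rep(1:3, each = 4),
  Tax   = rep(c("A", "B", "C", "D"), times = 3),
  Count = c(1, 2, 1, 2, 1, 2, 1, 1, 1, 2, 1, 1),
  stringsAsFactors = FALSE
)

# Hypothetical mapping: names are original Tax values, values are group labels.
tax_to_group <- c(A = "ABC", B = "ABC", C = "ABC", D = "D")

# Look up each row's group label by its Tax value.
Dummy$TaxGroup <- unname(tax_to_group[as.character(Dummy$Tax)])

# Sum Count within each Year/TaxGroup combination.
combined <- aggregate(Count ~ Year + TaxGroup, data = Dummy, FUN = sum)
```

A Tax value missing from the lookup would get an NA group, and aggregate would drop it from the grouped result, so the mapping should cover every value present in the data.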
Upvotes: 1
Reputation: 121568
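Here is an option using ddply. It relies on row order: within each Year, every row except the last is combined, and the last row is kept as is.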
ddply(Dummy,.(Year),summarise,
Tax=c(Reduce(paste0,head(Tax,-1)),as.character(tail(Tax,1))),
Count=c(sum(head(Count,-1)),tail(Count,1)))
Year Tax Count
1 1 ABC 4
2 1 D 2
3 2 ABC 4
4 2 D 1
5 3 ABC 4
6 3 D 1
Upvotes: 1