timothy

Reputation: 17

Sum different levels of a vector together within id

My data could look like this:

id <- c('A1','A1','A1','A1','B2','B2','B2','B2','C3','C3','C3','C3')
event <- c('a', 'b', 'c', 'd','a', 'b', 'c', 'd','a', 'b', 'c', 'd')
value <- c(3,2,5,3,6,5,7,6,4,5,6,7)
Dat <- data.frame(id, event, value)

Now what I would like to do is sum certain values together based on different levels of event within each id. For example, within each id, combining a, b and c would result in a new level, say comb_abc (for id A1 its value would be 10). Then id A1 would have only two levels in the event vector: "comb_abc" = 10 and "some_name" (d) = 3. Here I am renaming levels a, b and c to comb_abc, and d to some_name. The same would happen for each id. How can I do that?

Thank you!

Upvotes: 0

Views: 52

Answers (2)

ThomasIsCoding

Reputation: 101638

Here is another base R option with aggregate

aggregate(
  value ~ id + cbind(event = c("some_name", "comb_abc")[1 + event %in% c("a", "b", "c")]),
  Dat,
  sum
)

which gives

  id     event value
1 A1  comb_abc    10
2 B2  comb_abc    18
3 C3  comb_abc    15
4 A1 some_name     3
5 B2 some_name     6
6 C3 some_name     7
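The indexing trick inside that formula can be easier to follow in isolation. The following is a minimal sketch, with `idx` and `labels` as illustrative names that are not part of the answer: `event %in% c("a", "b", "c")` yields a logical vector, adding 1 coerces it to the integers 1 (no match) or 2 (match), and those integers pick from the two-element label vector.

```r
# The event values for one id, as in the question
event <- c("a", "b", "c", "d")

# TRUE/FALSE becomes 1/0 under `+`, so idx is 2 for a, b, c and 1 for d
idx <- 1 + event %in% c("a", "b", "c")

# Position 1 holds the "no match" label, position 2 the "match" label
labels <- c("some_name", "comb_abc")[idx]
labels
```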

If you have more than one level to combine, here is a small example showing you a possible option

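set.seed(1)
v <- sample(letters[1:8], 20, replace = TRUE)
comb <- list(c("a", "b", "c"), c("d", "e", "f"), c("g", "h"))
# one label per group, e.g. "abc", "def", "gh"
labels <- sapply(comb, paste0, collapse = "")
# for each element of v, the index of the group it belongs to
idx <- Reduce(`+`, lapply(seq_along(comb), function(k) k * (v %in% comb[[k]])))
res <- labels[idx]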

which gives

> res
 [1] "abc" "def" "gh"  "abc" "abc" "def" "gh"  "abc" "def" "abc" "abc" "abc"
[13] "abc" "def" "def" "abc" "def" "def" "abc" "gh"
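For the data in the question, one alternative way to express a several-groups mapping is a named lookup vector. This is only a sketch, not the method used above: the name `lookup` and its contents are assumptions made for illustration, and any event missing from the lookup would be mapped to `NA`.

```r
# Rebuild the question's data so the example is self-contained
Dat <- data.frame(
  id    = rep(c("A1", "B2", "C3"), each = 4),
  event = rep(c("a", "b", "c", "d"), times = 3),
  value = c(3, 2, 5, 3, 6, 5, 7, 6, 4, 5, 6, 7)
)

# Hypothetical lookup: names are the original levels, values the new labels
lookup <- c(a = "comb_abc", b = "comb_abc", c = "comb_abc", d = "some_name")

# Replace each event by its new label (unmapped events would become NA)
Dat$event <- unname(lookup[as.character(Dat$event)])

# Sum value within each id and new event label
res <- aggregate(value ~ id + event, data = Dat, FUN = sum)
res
```

Adding another group then only means adding entries to `lookup`, rather than nesting further `ifelse()` calls.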

Upvotes: 0

Ronak Shah

Reputation: 388992

You can change the values 'a', 'b' and 'c' to 'comb_abc', change the rest of them ('d') to 'some_name', and then take the sum of value for each id and event.

aggregate(value ~ id + event, transform(Dat,
         event = ifelse(event %in% c('a','b','c'), 'comb_abc', 'some_name')), sum)

In dplyr this can be done as:

library(dplyr)

Dat %>%
  mutate(event = if_else(event %in% c('a','b','c'), 'comb_abc', 'some_name')) %>%
  group_by(id, event) %>%
  summarise(value = sum(value))

#  id    event     value
#  <chr> <chr>     <dbl>
#1 A1    comb_abc     10
#2 A1    some_name     3
#3 B2    comb_abc     18
#4 B2    some_name     6
#5 C3    comb_abc     15
#6 C3    some_name     7

Upvotes: 1
