Applying functions on columns by group

Question

I would like to apply a function on sets of data based on their category. Given the following data frame

pet <- c(rep("cat",5),rep("dog",5))
year <- c(rep(1991:1995,2))
karma <- c(5,4,1,1,1,6,4,3,2,6)
df <- data.frame(pet,year,karma)

that looks like this

   pet year karma
1  cat 1991     5
2  cat 1992     4
3  cat 1993     1
4  cat 1994     1
5  cat 1995     1
6  dog 1991     6
7  dog 1992     4
8  dog 1993     3
9  dog 1994     2
10 dog 1995     6

I would like to perform operations on the karma column for each year. For a built-in function like sum, this can be done with ddply from the plyr package:

library(plyr)
ddply(df, .(year), summarize, sum(karma))

How would I apply it to a function I have written myself, for example

calc <- function(d,c){(d*5+c*7)/12}

where d is the dog's karma for a given year and c is the cat's karma for the same year.

Ideally, I would like to append five more rows to this data frame, each with pet set to "both", a year, and the karma value calculated by the function above. What would be the best way of doing that?

(Terribly sorry if this is trivial, but I really couldn't find a similar question this time.)

Lucy · Accepted Answer

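You can use spread to reshape your data frame to wide format, with one karma column per pet, then use mutate to apply your function, and finally bind the new rows onto the original data:

library(tidyr)
library(dplyr)
df %>%
  # one row per year, with columns `cat` and `dog` holding the karma values
  spread(pet, karma, drop = FALSE) %>%
  # apply calc() row-wise (it is vectorized) and label the new rows
  mutate(karma = calc(dog, cat), pet = "both") %>%
  select(year, pet, karma) %>%
  # append the original rows; rbind matches data frame columns by name
  rbind(df)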
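For comparison, here is a minimal base R sketch of the same reshape-then-compute idea, with no packages required. It assumes every year has exactly one cat row and one dog row, as in the example data. The intermediate names cats, dogs, wide, both and result are made up for illustration and are not part of the question.

```r
# Data and function from the question
calc <- function(d, c) (d * 5 + c * 7) / 12
pet <- c(rep("cat", 5), rep("dog", 5))
year <- rep(1991:1995, 2)
karma <- c(5, 4, 1, 1, 1, 6, 4, 3, 2, 6)
df <- data.frame(pet, year, karma, stringsAsFactors = FALSE)

# Split by pet, keeping only the columns needed for the join
cats <- df[df$pet == "cat", c("year", "karma")]
dogs <- df[df$pet == "dog", c("year", "karma")]

# Join on year; the shared karma column becomes karma_dog and karma_cat
wide <- merge(dogs, cats, by = "year", suffixes = c("_dog", "_cat"))

# Build the new rows with the same column names as df
both <- data.frame(
  pet = "both",
  year = wide$year,
  karma = calc(wide$karma_dog, wide$karma_cat),
  stringsAsFactors = FALSE
)

# Append the five "both" rows to the original ten
result <- rbind(df, both)
print(result)
```

If you stay with tidyr, note that spread is superseded in tidyr 1.0.0 and later; pivot_wider(names_from = pet, values_from = karma) could replace the spread line in the pipeline above.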
