Dong

Reputation: 491

dplyr: apply different functions to different groups

I am a beginner trying to use dplyr for data analysis. My data come from a few operations ("Ops") and are well ordered. I often need to apply different functions to the observations ("Num") depending on the type of operation, then combine the results for analysis.

A trivial example is below:

  X      Num  Ops
  0       37   S
  1       18   R
  2       11   S
  3        3   R
  4       11   S
  5       13   R
  ...     ... ...

I want to add a new column "Num2" whose value depends on the column "Ops", e.g.:

df %>% mutate(Num2 = ifelse(Ops == "S", Num - 1, Num + 1))

I am not sure if I should do a lot of ifelse assignments -- it feels redundant and inefficient.

There must be a much better solution, maybe using some combinations of "group_by, select, filter". Any suggestions?

Basically I want to figure out if there is a way to group the data according to certain criteria, then apply different functions to different subsets, and finally merge the results back together. Typical dplyr examples I found apply the same function(s) to all subsets.

@eddi below provided a more general solution using data.table. Is there a dplyr equivalent?

Upvotes: 4

Views: 1316

Answers (2)

shadow

Reputation: 22293

There is a dplyrExtras package that includes a mutate_if function.

# install dplyrExtras from GitHub
library(devtools)
install_github(repo="skranz/dplyrExtras")
library(dplyrExtras)
# default rule first, then overwrite the rows where Ops is "S"
df %>% 
  mutate(Num2 = Num+1) %>% 
  mutate_if(Ops=="S", Num2 = Num-1)
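If installing a package from GitHub is not an option, the same "one rule per value of Ops" idea can probably be written with dplyr alone using case_when (available in dplyr 0.5.0 and later). This is only a minimal sketch: the data frame is rebuilt from the six rows shown in the question, and the NA fallback for unknown Ops values is my assumption, not something the question asks for.

```r
library(dplyr)

# Sample data rebuilt from the rows shown in the question
df <- data.frame(
  X   = 0:5,
  Num = c(37, 18, 11, 3, 11, 13),
  Ops = c("S", "R", "S", "R", "S", "R"),
  stringsAsFactors = FALSE
)

# One formula per operation type; the first matching condition wins
result <- df %>%
  mutate(Num2 = case_when(
    Ops == "S" ~ Num - 1,
    Ops == "R" ~ Num + 1,
    TRUE       ~ NA_real_  # assumed fallback for any other operation
  ))

print(result)
```

Adding another operation type is then one more line inside case_when rather than another nested ifelse.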

Upvotes: 1

shadow

Reputation: 22293

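You can easily avoid the ifelse when the return values are numeric. A logical condition is coerced to 0 or 1 in arithmetic, so you can fold it into the calculation: below, rows with Ops == "S" get Num - 2 + 1 = Num - 1, and all other rows get Num + 1.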

df %>% mutate(Num2 = Num - 2*(Ops=="S") + 1)
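The arithmetic trick above only covers two groups. For the more general case asked about in the question (a different function per group), one possible approach is to keep the functions in a named list and look them up by group key inside a grouped mutate. This is a rough sketch, not a tested recipe for large data: the list name `ops_funs` and the sample data frame are made up for illustration, and it assumes every value of Ops has an entry in the list.

```r
library(dplyr)

# Sample data rebuilt from the rows shown in the question
df <- data.frame(
  X   = 0:5,
  Num = c(37, 18, 11, 3, 11, 13),
  Ops = c("S", "R", "S", "R", "S", "R"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table: one function per operation type
ops_funs <- list(
  S = function(x) x - 1,
  R = function(x) x + 1
)

# Within each group Ops is constant, so its first value is the group key;
# the matching function is applied to that group's Num values.
result <- df %>%
  group_by(Ops) %>%
  mutate(Num2 = ops_funs[[as.character(Ops[1])]](Num)) %>%
  ungroup()

print(result)
```

The rows keep their original order, so there is no separate merge step after the grouped calculation.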

Upvotes: 0
