zhiwei li

Reputation: 1711

How to use a conditional statement in a pipe in R

I am trying to use a conditional statement inside a pipe, but it fails.

The data looks like this:

group = rep(letters[1:3], each = 3)
status = c(T,T,T,  T,T,F,  F,F,F)
value  = c(1:9)

df = data.frame(group = group, status = status, value = value)

> df
  group status value
1     a   TRUE     1
2     a   TRUE     2
3     a   TRUE     3
4     b   TRUE     4
5     b   TRUE     5
6     b  FALSE     6
7     c  FALSE     7
8     c  FALSE     8
9     c  FALSE     9

I want the row with the maximum value in each group, with one condition: if any status in the group is TRUE, then filter(status == T) %>% slice_max(value); otherwise just slice_max(value).

This is what I have tried:

# way 1
df %>% 
  group_by(group) %>% 
  if(any(status) == T) {
    filter(status == T) %>% slice_max(value)
  } else {
    slice_max(value)
  }

# way 2 
df %>% 
  group_by(group) %>% 
  when(any(status) == T,
    filter(status == T) %>% slice_max(value),
    slice_max(value))

The output I expect looks like this:

> expected_df
  group status value
1     a   TRUE     3
2     b   TRUE     5
3     c  FALSE     9

Any help will be highly appreciated!

Upvotes: 0

Views: 652

Answers (3)

Ronak Shah

Reputation: 388807

A bit more verbose:

library(dplyr)

df %>%
  group_by(group) %>%
  filter(if (any(status)) value == max(value[status]) else value == max(value)) %>%
  ungroup()

#  group status value
#  <chr> <lgl>  <int>
#1 a     TRUE       3
#2 b     TRUE       5
#3 c     FALSE      9

Upvotes: 1

Onyambu

Reputation: 79188

df %>% 
   group_by(group) %>%
   slice(which.max(value*(all(!status)|status)))
# A tibble: 3 x 3
# Groups:   group [3]
  group status value
  <chr> <lgl>  <int>
1 a     TRUE       3
2 b     TRUE       5
3 c     FALSE      9
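
To see why the which.max() expression picks the right row, here is a minimal base R sketch that walks through it for a single group. The vectors are group b from the question; the names mask, masked and pick are only for illustration and are not part of the answer above.

```r
# Group "b" from the question: two TRUE rows and one FALSE row.
value  <- c(4, 5, 6)
status <- c(TRUE, TRUE, FALSE)

# all(!status) is a single logical: TRUE only when no row in the group is TRUE.
# OR-ing it with status keeps every row in that case, and only the TRUE rows otherwise.
mask <- all(!status) | status

# Multiplying by the logical mask turns the excluded rows into 0.
masked <- value * mask

# which.max() returns the position of the first maximum of the masked values.
pick <- which.max(masked)
```

Because excluded rows become 0 rather than being dropped, this trick assumes the values are positive, as they are in the question's data.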

That said, the best approach is to arrange the data.

Upvotes: 0

MrFlick

Reputation: 206167

Try arranging the data by status, then by value, and then just taking the first row in each group:

df %>% 
  group_by(group) %>% 
  arrange(!status, desc(value)) %>% 
  slice(1)

Because we sort by status first, a row with a TRUE status comes first in its group if there is one; if not, you just get the row with the largest value. Generally it's a bit awkward to combine pipes and if statements. If that's something you want to look into, it's covered in this existing question, but if statements don't work with group_by.
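
If you do want an explicit if per group, one possible way is dplyr's group_modify(), which calls a function once for each group's rows. This is only a sketch, assuming a dplyr version that has slice_max() (1.0.0 or later); the helper name pick_max is made up for this example.

```r
library(dplyr)

df <- data.frame(
  group  = rep(letters[1:3], each = 3),
  status = c(TRUE, TRUE, TRUE,  TRUE, TRUE, FALSE,  FALSE, FALSE, FALSE),
  value  = 1:9
)

# Hypothetical helper: .x holds the rows of one group, .y holds its key.
pick_max <- function(.x, .y) {
  if (any(.x$status)) {
    # At least one TRUE in this group: only consider the TRUE rows.
    .x %>% filter(status) %>% slice_max(value)
  } else {
    # No TRUE in this group: take the overall maximum.
    .x %>% slice_max(value)
  }
}

result <- df %>%
  group_by(group) %>%
  group_modify(pick_max) %>%
  ungroup()
```

Here the if sees one group's rows at a time, so any() is evaluated per group rather than on the whole grouped data frame. The arrange() version above is shorter; this form is mainly useful when the two branches need to do genuinely different things.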

Upvotes: 1
