elliezee
elliezee

Reputation: 133

Find minimum value in an array that is larger than another column in R

I need to find the minimum values of three columns that are bigger than the values in another column. Say these five individuals entered a hospital in different months of the year, and they suffered several heart attacks before and after hospitalization. I need the first heart attack after hospitalization.


id<-c(100,105,108,200,205)
hosp<-c(3,5,2,6,2)
attack1<-c(1,6,3,4,1)
attack2<-c(4,7,9,10,NA)
attack3<-c(5,10,NA,NA,NA)
out<-c(7,12,11,12,9)

data <- data.frame(id,hosp,attack1,attack2,attack3,out)

   id hosp attack1 attack2 attack3 out
1 100    3       1       4       5   7
2 105    5       6       7      10  12
3 108    2       3       9      NA  11
4 200    6       4      10      NA  12
5 205    2       1      NA      NA   9

So the data should end up looking something like

   id hosp attack1 attack2 attack3 out afterh
1 100    3       1       4       5   7      4
2 105    5       6       7      10  12      6
3 108    2       3       9      NA  11      3
4 200    6       4      10      NA  12     10
5 205    2       1      NA      NA   9     NA

This is my attempt which did not work:

min_f<-function(a){
  x<-min(a[a>hosp])
}

data %>% mutate_if(vars(attack1,attack2,attack3),min_f())

Upvotes: 5

Views: 327

Answers (3)

AnilGoyal
AnilGoyal

Reputation: 26228

Nearly similar answer but using dplyr only

library(dplyr, warn.conflicts = F)

data %>% rowwise() %>%
  mutate(afterh = {xx <- select(cur_data(), starts_with('attack')); first(xx[xx > hosp])})

#> # A tibble: 5 x 7
#> # Rowwise: 
#>      id  hosp attack1 attack2 attack3   out afterh
#>   <dbl> <dbl>   <dbl>   <dbl>   <dbl> <dbl>  <dbl>
#> 1   100     3       1       4       5     7      4
#> 2   105     5       6       7      10    12      6
#> 3   108     2       3       9      NA    11      3
#> 4   200     6       4      10      NA    12     10
#> 5   205     2       1      NA      NA     9     NA

Created on 2021-06-12 by the reprex package (v2.0.0)

Upvotes: 1

Anoushiravan R
Anoushiravan R

Reputation: 21928

You can use the following solution.

  • Here c(...) refers to all variables in each row of your data set and I chose only those that starts with attack
  • Then I chose only those values that are greater than the corresponding value of hosp in each row and since you were looking for the first one that is greater than the value of hosp I used first function to extract that
  • ..2 also refers to the value of the second variable hosp in each row
library(dplyr)
library(purrr)

data %>%
  mutate(afterh = pmap_dbl(., ~ {x <- c(...)[3:5]; 
  first(sort(x[x > ..2]))}))

   id hosp attack1 attack2 attack3 out afterh
1 100    3       1       4       5   7      4
2 105    5       6       7      10  12      6
3 108    2       3       9      NA  11      3
4 200    6       4      10      NA  12     10
5 205    2       1      NA      NA   9     NA

As an alternative as mentioned by dear Mr. @Greg in a very large data set, we can use min function in place of first(sort)) combination to ensure a faster evaluation time of the following solution. In case there is no value greater than hosp like in the last row min function would return Inf so I made sure that it would return the value 0 instead you can change it with the value you prefer:

data %>%
  mutate(afterh = pmap_dbl(., ~ {x <- c(...)[3:5];
  out <- min(x[x > ..2], na.rm = TRUE);
  if(!is.finite(out)) 0 else out}))

   id hosp attack1 attack2 attack3 out afterh
1 100    3       1       4       5   7      4
2 105    5       6       7      10  12      6
3 108    2       3       9      NA  11      3
4 200    6       4      10      NA  12     10
5 205    2       1      NA      NA   9      0

Upvotes: 4

Oliver
Oliver

Reputation: 8582

data %>% 
  # Nest attack columns
  nest(attacks = starts_with('attack')) %>% 
  # Only one row at a time
  rowwise() %>% 
  # Find first instance for each row
  mutate(afterh = first(attacks[attacks > hosp])) %>% 
  # Unnest attacks
  unnest(attacks)

Upvotes: 2

Related Questions