Giuseppe Petri

Reputation: 640

How to create a new variable based on one condition (is.na) at the minimum of another variable in dplyr?

I need to create a new variable (obs.new) that keeps the original value of obs, except when obs is missing at the minimum date of a group. In that case, obs.new should take the mean.obs value. All other rows where obs is NA should remain NA.

This is a reproducible example of what I did:

library(dplyr)

data.1 <- read.csv(text = "
site ,treat,date,obs,mean.obs,
1,a,33,0.585581765,0.4,
1,a,34,0.871886986,0.4,
1,a,35,,0.4,
1,a,36,,0.4,
1,a,37,,0.4,
1,a,38,,0.4,
1,a,39,0.628236902,0.4,
1,a,40,0.041956742,0.4,
1,b,36,,0.52,
1,b,37,0.327067686,0.52,
1,b,38,,0.52,
1,b,39,,0.52,
1,b,40,,0.52,
1,b,41,0.982637394,0.52,
1,b,42,0.80141212,0.52,
1,b,43,0.739522519,0.52,
2,a,56,,0.48,
2,a,57,0.724849037,0.48,
2,a,58,0.050617254,0.48,
2,a,59,,0.48,
2,a,60,,0.48,
2,a,61,,0.48,
2,a,62,,0.48,
2,a,63,0.269993451,0.48,
2,b,23,0.216291392,0.49,
2,b,24,,0.49,
2,b,25,,0.49,
2,b,26,,0.49,
2,b,27,,0.49,
2,b,28,,0.49,
2,b,29,0.951644067,0.49,
2,b,30,0.745131113,0.49")


data.1.1 <- data.1 %>%
  group(site, treat) %>%
  mutate(obs.new = if_else(is.na(slice(which.min(date))),
                           mean.obs, obs))

This is the error I got:

Error: Problem with `mutate()` input `obs.new`.
x no applicable method for 'slice' applied to an object of class "c('integer', 'numeric')"
i Input `obs.new` is `if_else(is.na(slice(which.min(date))), mean.obs, obs)`.
i The error occurred in group 1: site = 1, treat = "a".
Run `rlang::last_error()` to see where the error occurred.

The expected result is this:

[Image: table of the expected result]

Thanks for any hint.

Upvotes: 0

Views: 103

Answers (5)

akrun

Reputation: 886938

Using data.table

library(data.table)
setDT(data.1)[, mean.obs.new := fifelse(is.na(obs) & date == min(date), mean.obs, obs), .(site, treat)]

Upvotes: 0

Ronak Shah

Reputation: 388807

You could replace the obs value with mean.obs when obs is NA and date is the minimum date in the group.

library(dplyr)

data.1 %>%
  group_by(site, treat) %>%
  mutate(mean.obs.new = ifelse(is.na(obs) & date == min(date), mean.obs, obs))

#    site treat  date     obs mean.obs mean.obs.new
#   <int> <chr> <int>   <dbl>    <dbl>        <dbl>
# 1     1 a        33  0.586      0.4        0.586 
# 2     1 a        34  0.872      0.4        0.872 
# 3     1 a        35 NA          0.4       NA     
# 4     1 a        36 NA          0.4       NA     
# 5     1 a        37 NA          0.4       NA     
# 6     1 a        38 NA          0.4       NA     
# 7     1 a        39  0.628      0.4        0.628 
# 8     1 a        40  0.0420     0.4        0.0420
# 9     1 b        36 NA          0.52       0.52  
#10     1 b        37  0.327      0.52       0.327 
# … with 22 more rows

Upvotes: 1

TarJae

Reputation: 78917

Change group to group_by; then you can use case_when():

library(dplyr)
data.1 %>%
  group_by(site, treat) %>%
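  mutate(obs.new = case_when(!is.na(obs) ~ obs,             # keep observed values
                             date == min(date) ~ mean.obs,  # fill a missing obs at the group's earliest date
                             TRUE ~ NA_real_))              # every other missing obs stays NA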
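
For context on the original error: slice() is a dplyr verb that works on data frames, so it has no method for the integer returned by which.min(date). Plain vector indexing, obs[which.min(date)], gives the obs value at the earliest date instead. The following is only a minimal sketch of that idea; toy, grp and first.missing are hypothetical names made up for illustration, not the question's data.

```r
library(dplyr)

# Hypothetical toy data: group "a" has a missing obs at its earliest date,
# group "b" does not
toy <- data.frame(
  grp      = c("a", "a", "a", "b", "b", "b"),
  date     = c(1, 2, 3, 1, 2, 3),
  obs      = c(NA, 0.2, NA, 0.5, NA, 0.7),
  mean.obs = c(0.4, 0.4, 0.4, 0.6, 0.6, 0.6)
)

res <- toy %>%
  group_by(grp) %>%
  mutate(
    # vector indexing: is the obs at the group's earliest date missing?
    first.missing = is.na(obs[which.min(date)]),
    # fill only the earliest-date row, and only when its obs is missing
    obs.new = if_else(first.missing & date == min(date), mean.obs, obs)
  ) %>%
  ungroup()

print(res)
```

In this sketch only the first row of group "a" is filled with mean.obs; the remaining missing values stay NA.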

Upvotes: 0

Jon Spring

Reputation: 66415

data.1 %>%
    group_by(site, treat) %>%
    # row_number() == 1 is the earliest date only if rows are sorted by date within each group
    mutate(obs.new = coalesce(obs,
        if_else(row_number() == 1, mean.obs, NA_real_))) %>%
    ungroup()

# A tibble: 32 x 7
    site treat  date     obs mean.obs X     obs.new
   <int> <chr> <int>   <dbl>    <dbl> <lgl>   <dbl>
 1     1 a        33  0.586      0.4  NA     0.586 
 2     1 a        34  0.872      0.4  NA     0.872 
 3     1 a        35 NA          0.4  NA    NA     
 4     1 a        36 NA          0.4  NA    NA     
 5     1 a        37 NA          0.4  NA    NA     
 6     1 a        38 NA          0.4  NA    NA     
 7     1 a        39  0.628      0.4  NA     0.628 
 8     1 a        40  0.0420     0.4  NA     0.0420
 9     1 b        36 NA          0.52 NA     0.52  
10     1 b        37  0.327      0.52 NA     0.327 
# … with 22 more rows
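
Note that row_number() == 1 picks the first row of each group as stored, which coincides with the minimum date only when the rows are already sorted by date, as they are in the example data. If that cannot be assumed, one option is to sort first. A minimal sketch on hypothetical toy data (toy and grp are made-up names, and the rows are deliberately out of order):

```r
library(dplyr)

# Hypothetical toy data, deliberately not sorted by date
toy <- data.frame(
  grp      = c("a", "a", "a"),
  date     = c(3, 1, 2),
  obs      = c(0.9, NA, NA),
  mean.obs = c(0.4, 0.4, 0.4)
)

res <- toy %>%
  arrange(grp, date) %>%   # make row 1 of each group the earliest date
  group_by(grp) %>%
  mutate(obs.new = coalesce(obs,
      if_else(row_number() == 1, mean.obs, NA_real_))) %>%
  ungroup()

print(res)
```

Sorting changes the row order of the result; if the original order matters, comparing date == min(date) inside mutate() avoids the sort.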

Upvotes: 0

Jaime Yáñez

Reputation: 98

data.1 %>%
  group_by(site, treat) %>%
  mutate(obs.new = if_else(!is.na(obs), 
                           obs,
                           if_else(date == min(date), 
                                   mean.obs,
                                   NA_real_)  # other missing obs stay NA, as the question asks
                           )
         )

Upvotes: 1
