virginia_wolff
virginia_wolff

Reputation: 39

Replacing dataframe value given multiple condition from another dataframe with R

I have two dataframes one with the dates (converted in months) of multiple survey replicates for a given grid cell and the other one with snow data for each month for the same grid cell, they have a matching ID column to identify the cells. What I would like to do is to replace in the first dataframe, the one with months of survey replicates, the month value with the snow value for that month considering the grid cell ID. Thank you

CellID <- c(1,2,3,4,5,6)
sampl1 <- c("oct", "oct", "oct", "nov", NA, NA)
sampl2 <- c("nov", "nov", "jan", NA, NA, NA)
sampl3 <- c("dec", "dec", "jan", NA, NA, NA)
df1 <- data.frame(CellID, sampl1, sampl2, sampl3)
print(df1)

CellID <- c(1,2,3,4,5,6)
oct <- c(0.1, 0.1, 0.1, 0.1, 0.1, 0.1)
nov <- c(0.4, 0.5, 0.4, 0.5, 0.6, 0.5)
dec <- c(0.6, 0.7, 0.8, 0.7, 0.6, 0.8)
df2 <- data.frame(CellID, oct, nov, dec)
print(df2)

CellID <- c(1,2,3,4,5,6)
sampl1_snow <- c(0.1, 0.1, 0.1, 0.5, NA, NA)
sampl2_snow <- c(0.4, 0.5, 0.9, NA, NA, NA)
sampl3_snow <- c(0.6, 0.7, 1, NA, NA, NA)
df3 <- data.frame(CellID, sampl1_snow, sampl2_snow, sampl3_snow)
print(df3)

Upvotes: 2

Views: 220

Answers (3)

akrun
akrun

Reputation: 887851

In base R, we can use match

df1[-1] <- df2[-1][cbind(df1$CellID[col(df1[-1])], 
         match(as.matrix(df1[-1]), names(df2)[-1]))]

Upvotes: 0

Martin Gal
Martin Gal

Reputation: 16998

You could use

library(purrr)
library(dplyr)

df1 %>% 
  mutate(
    across(
      starts_with("sampl"),
      ~imap_dbl(.x, ~ifelse(is.null(df2[.y, .x]), NA_real_, df2[.y, .x])),
      .names = "{.col}_snow"
      ),
    .keep = "unused"
    )

to get

  CellID sampl1_snow sampl2_snow sampl3_snow
1      1         0.1         0.4         0.6
2      2         0.1         0.5         0.7
3      3         0.1         0.9         0.9
4      4         0.5          NA          NA
5      5          NA          NA          NA
6      6          NA          NA          NA

Data

For df2 I used

structure(list(CellID = c(1, 2, 3, 4, 5, 6), oct = c(0.1, 0.1, 
0.1, 0.1, 0.1, 0.1), nov = c(0.4, 0.5, 0.4, 0.5, 0.6, 0.5), dec = c(0.6, 
0.7, 0.8, 0.7, 0.6, 0.8), jan = c(0, 0, 0.9, 0, 0, 0)), class = "data.frame", row.names = c(NA, 
-6L))

Upvotes: 1

Wimpel
Wimpel

Reputation: 27802

df3 <- df1
df3[!is.na(df1)] <- df2[!is.na(df1)]
#   CellID sampl1 sampl2 sampl3
# 1      1    0.1    0.4    0.6
# 2      2    0.1    0.5    0.7
# 3      3    0.1    0.4    0.8
# 4      4    0.1   <NA>   <NA>
# 5      5   <NA>   <NA>   <NA>
# 6      6   <NA>   <NA>   <NA>

Upvotes: 0

Related Questions