Using dplyr or base R, how to efficiently fill a data frame column with values from other columns using specified conditions?

Question

I generate the reGrp column of the data frame below using the code shown at the very bottom, which relies on my usual rookie technique for conditionally filling a data frame column. The derivation of each reGrp value is explained in the right-most column, "Explanation of reGrp values", which I added manually:

  Element grpRnk reSeq reGrp   Explanation of reGrp values
1       B     NA   1.1   1.1   since grpRnk <> 2, use reSeq value 
2       R     NA   1.1   1.1   since grpRnk <> 2, use reSeq value  
3       R      2   2.0   2.0   since grpRnk = 2 and it is the first instance of grpRnk 2, use reSeq value 
4       R      2   3.0   2.0   since grpRnk = 2 and it is not the first instance of grpRnk 2, borrow the reGrp value from the row above
5       B     NA   1.2   1.2   since grpRnk <> 2, use reSeq value 
6       X      1   1.1   1.1   since grpRnk <> 2, use reSeq value 
7       X      1   1.2   1.2   since grpRnk <> 2, use reSeq value

Is there a more efficient way to do this in base R or dplyr, without repeatedly creating and overwriting the same column the way I do in the code below when deriving reGrp?

Code:

library(dplyr)

data <- data.frame(
  Element = c("B","R","R","R","B","X","X"),
  grpRnk = c(NA,NA,2,2,NA,1,1),
  reSeq = c(1.1,1.1,2,3,1.2,1.1,1.2)
  )

data %>% 
  mutate(reGrp = ifelse(grpRnk == 2 & is.na(lag(grpRnk)),reSeq,NA)) %>%
  mutate(reGrp = ifelse(is.na(reGrp) & grpRnk == lag(grpRnk) & grpRnk ==2,lag(reGrp),reGrp)) %>%
  mutate(reGrp = ifelse(is.na(reGrp),reSeq,reGrp))

Maël · Accepted Answer

Something like this?

library(dplyr)
data %>% 
  mutate(reGrp = case_when(grpRnk != 2 | is.na(grpRnk) ~ reSeq,
                           # rows with grpRnk == 2 take the reSeq of the first such row
                           grpRnk == 2 ~ first(reSeq[which(grpRnk == 2)])))

Output:

  Element grpRnk reSeq reGrp
1       B     NA   1.1   1.1
2       R     NA   1.1   1.1
3       R      2   2.0   2.0
4       R      2   3.0   2.0
5       B     NA   1.2   1.2
6       X      1   1.1   1.1
7       X      1   1.2   1.2
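
Since the question also asks about base R, here is a minimal sketch of one way to do it without dplyr. It is not part of the accepted answer. The dplyr answer above fills every grpRnk == 2 row with a single value taken from the whole column, whereas this sketch assumes that each consecutive run of grpRnk == 2 rows should borrow the reSeq of the first row in that run, which is how I read "borrow the reGrp value from the row above". The helper names is2, runs, run_id and first_in_run are my own, chosen for illustration.

```r
data <- data.frame(
  Element = c("B", "R", "R", "R", "B", "X", "X"),
  grpRnk  = c(NA, NA, 2, 2, NA, 1, 1),
  reSeq   = c(1.1, 1.1, 2, 3, 1.2, 1.1, 1.2)
)

# TRUE where grpRnk is 2; %in% returns FALSE (not NA) for NA values
is2 <- data$grpRnk %in% 2

# Give each consecutive run of identical is2 values its own id
runs   <- rle(is2)
run_id <- rep(seq_along(runs$lengths), runs$lengths)

# First reSeq value of each run, repeated across the rows of that run
first_in_run <- ave(data$reSeq, run_id, FUN = function(x) x[1])

# Rows in a grpRnk == 2 run take the run's first reSeq; all others keep their own
data$reGrp <- ifelse(is2, first_in_run, data$reSeq)
data
```

On the sample data this should reproduce the reGrp column shown in the question. If the data can only ever contain one run of grpRnk == 2, the run bookkeeping is unnecessary and the single case_when call is simpler.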
