Reputation: 2043
I generate the reGrp column of the data frame below using the code shown at the very bottom, which relies on my usual rookie technique for conditionally filling a data frame column. The derivation of the reGrp values is explained in the right-most column, "Explanation of reGrp values", which I added manually:
  Element grpRnk reSeq reGrp  Explanation of reGrp values
1       B     NA   1.1   1.1  since grpRnk <> 2, use reSeq value
2       R     NA   1.1   1.1  since grpRnk <> 2, use reSeq value
3       R      2   2.0   2.0  since grpRnk = 2 and it is the first instance of grpRnk 2, use reSeq value
4       R      2   3.0   2.0  since grpRnk = 2 and it is not the first instance of grpRnk 2, borrow the reGrp value from the row above
5       B     NA   1.2   1.2  since grpRnk <> 2, use reSeq value
6       X      1   1.1   1.1  since grpRnk <> 2, use reSeq value
7       X      1   1.2   1.2  since grpRnk <> 2, use reSeq value
Is there a more efficient way to do this in base R or dplyr, without repeatedly creating and overwriting the same column the way I do in the code below to derive reGrp?
Code:
library(dplyr)

data <- data.frame(
  Element = c("B", "R", "R", "R", "B", "X", "X"),
  grpRnk  = c(NA, NA, 2, 2, NA, 1, 1),
  reSeq   = c(1.1, 1.1, 2, 3, 1.2, 1.1, 1.2)
)

data %>%
  mutate(reGrp = ifelse(grpRnk == 2 & is.na(lag(grpRnk)), reSeq, NA)) %>%
  mutate(reGrp = ifelse(is.na(reGrp) & grpRnk == lag(grpRnk) & grpRnk == 2, lag(reGrp), reGrp)) %>%
  mutate(reGrp = ifelse(is.na(reGrp), reSeq, reGrp))
Upvotes: 0
Views: 772
Reputation: 51894
Something like this?
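library(dplyr)

data %>%
  mutate(reGrp = case_when(
    # grpRnk is not 2 (or is missing): keep the row's own reSeq
    grpRnk != 2 | is.na(grpRnk) ~ reSeq,
    # grpRnk is 2: take reSeq from the first row where grpRnk is 2
    grpRnk == 2 ~ first(reSeq[which(grpRnk == 2)])
  ))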
Output:
  Element grpRnk reSeq reGrp
1       B     NA   1.1   1.1
2       R     NA   1.1   1.1
3       R      2   2.0   2.0
4       R      2   3.0   2.0
5       B     NA   1.2   1.2
6       X      1   1.1   1.1
7       X      1   1.2   1.2
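If grpRnk 2 can appear in more than one separate run of rows, a single first() over the whole column would give every run the same value. The question only shows one run, so the following is a minimal base R sketch under the assumption that each run of consecutive 2s should borrow the reSeq of its own first row. The helper names is2, runs, run_id and first_in_run are made up for illustration.

```r
data <- data.frame(
  Element = c("B", "R", "R", "R", "B", "X", "X"),
  grpRnk  = c(NA, NA, 2, 2, NA, 1, 1),
  reSeq   = c(1.1, 1.1, 2, 3, 1.2, 1.1, 1.2)
)

# TRUE where grpRnk is 2; a missing grpRnk counts as "not 2"
is2 <- !is.na(data$grpRnk) & data$grpRnk == 2

# Number each run of consecutive identical is2 values (1, 2, 3, ...)
runs   <- rle(is2)
run_id <- rep(seq_along(runs$lengths), runs$lengths)

# First reSeq value of each run, repeated for every row of that run
first_in_run <- ave(data$reSeq, run_id, FUN = function(x) x[1])

# Inside a run of 2s use the run's first reSeq; elsewhere keep reSeq
data$reGrp <- ifelse(is2, first_in_run, data$reSeq)

data
```

This computes reGrp in one assignment instead of overwriting the column three times, and it needs no packages. On the sample data it gives the same reGrp values as the table in the question.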
Upvotes: 1