NinjaR
NinjaR

Reputation: 623

Replace subsequent values of a column if a condition is met

I have a dataframe where I want to replace the values of a column till a certain number of rows if a condition is met

Example dataframe

A     B      C

10    15     0
11    11     6
3     12     0
4     6      9
5     9      0
7     8      0
7     1      8

If the value of any row in C is greater than 0, then the next 5 values of C need to be replaced with 0 resulting in the following dataframe

A     B      C

10    15     0
11    11     6
3     12     0
4     6      0
5     9      0
7     8      0
7     1      0

Any ways to achieve this without loops in R?

Upvotes: 1

Views: 282

Answers (2)

r2evans
r2evans

Reputation: 160407

This is a "rolling window" operation, so I suggest the use of zoo::rollapply.

dat <- read.table(header=TRUE, text='
A     B      C
10    15     0
11    11     6
3     12     0
4     6      9
5     9      0
7     8      0
7     1      8')

zoo::rollapply(dat$C, 5+1, FUN=function(a) {
  l <- length(a)
  if (any(a[-l] != 0)) 0 else a[l]
}, fill = NA, align = "right", partial = TRUE)
# [1] 0 6 0 0 0 0 0

(I used 5+1 to denote that we need one more than the rows we want to blank, as indicated by looking at a[-l]. Many thanks to @IceCreamToucan for pointing that out.)

Upvotes: 3

IceCreamToucan
IceCreamToucan

Reputation: 28675

First, check which(df$C > 0). For each element in the results, add 1:5 using lapply. Then, remove any duplicates in the results with unique (after unlisting). Finally, make sure no rows are greater than nrow(df) with pmin, and set df$C equal to 0 for those rows.

df$C[pmin(nrow(df), unique(unlist(lapply(which(df$C > 0), `+`, 1:5))))] <- 0

Since you originally tagged dplyr, here's a tidyverse method (doesn't modify df unless you assign the result to df)

library(tidyverse)

inds <- 
  which(df$C > 0) %>% 
    map(~ . + 1:5) %>% 
    unlist %>% 
    unique %>% 
    pmin(nrow(df))

df %>% 
  mutate(C = replace(C, inds, 0))

edit: actually it seems like the unique step is unecessary. Not sure if there's any performance impact of leaving repeated indices in, or if so how that compares to using unique

Upvotes: 5

Related Questions