Noura

Reputation: 474

Delete successive rows under condition in R

I have a data frame:

dt <- read.table(text = "
350 16
352 0
354 0
359 0
366 11
376 38  
380 28 
386 0
389 0
397 55 
398 45 
399 0  
400 19  
402 30")

When successive rows contain zero in the second column, I want to keep only the last zero row of each run, i.e. the one that immediately precedes a non-zero row in the second column.

The result must be:

dt <- read.table(text = "
350 16
359 0
366 11
376 38  
380 28 
389 0
397 55 
398 45 
399 0  
400 19  
402 30")

Upvotes: 1

Views: 84

Answers (3)

jogo

Reputation: 12569

Here is the data.table equivalent of @iod's solution:

library("data.table")

dt <- fread( 
"350 16
352 0
354 0
359 0
366 11
376 38
380 28
386 0
389 0
397 55
398 45
399 0
400 19
402 30")

dt[V2!=0 | shift(V2, type="lead")!=0]

Upvotes: 0

akrun

Reputation: 887991

Here is an option that creates a grouping variable with rleid, marking runs of zero and non-zero values in 'V2', and then filters with the conditions from the OP's post: keep every non-zero row, plus the last row of each run of zeros.

library(tidyverse)
library(data.table)
dt %>% 
    group_by(grp = rleid(V2 == 0)) %>% 
    filter(all(V2== 0) & row_number()==n() | V2 != 0) %>%
    ungroup %>%
    select(-grp)
# A tibble: 11 x 2
#      V1    V2
#   <int> <int>
# 1   350    16
# 2   359     0
# 3   366    11
# 4   376    38
# 5   380    28
# 6   389     0
# 7   397    55
# 8   398    45
# 9   399     0
#10   400    19
#11   402    30

Or, using data.table, the same logic can be applied:

setDT(dt)[dt[, .I[(V2 == 0 & seq_len(.N) == .N) | V2 != 0], rleid(V2 == 0)]$V1]
#     V1 V2
# 1: 350 16
# 2: 359  0
# 3: 366 11
# 4: 376 38
# 5: 380 28
# 6: 389  0
# 7: 397 55
# 8: 398 45
# 9: 399  0
#10: 400 19
#11: 402 30

Or, as @jogo mentioned in the comments, create a grouping column with rleid and then use an if/else condition: groups containing non-zero values are returned whole, while groups that have only 0 values in 'V2' are reduced to their last row.

setDT(dt)[, i:=rleid(V2)][, if (any(V2!=0)) .SD else .SD[.N], i] 

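NOTE: Because these solutions group on runs of values via rleid, they can be generalized to other conditions by changing the expression passed to rleid and the filter.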
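As a rough illustration of that generalization, here is a minimal base R sketch built on `rle`. The function name `keep_last_of_runs` and its argument `is_target` are hypothetical, not part of any package, and the sketch assumes `is_target` is a logical vector without NA values.

```r
dt <- read.table(text = "
350 16
352 0
354 0
359 0
366 11
376 38
380 28
386 0
389 0
397 55
398 45
399 0
400 19
402 30")

# Hypothetical helper: keep all rows where is_target is FALSE,
# plus the last row of every run where is_target is TRUE.
keep_last_of_runs <- function(df, is_target) {
  r <- rle(is_target)
  ends <- cumsum(r$lengths)      # row index of the last element of each run
  keep <- !is_target             # every non-target row is kept
  keep[ends[r$values]] <- TRUE   # plus the last row of each target run
  df[keep, , drop = FALSE]
}

out <- keep_last_of_runs(dt, dt$V2 == 0)
```

Passing a different logical vector, for example `dt$V2 < 20`, applies the same rule to another condition. Note that this sketch also keeps the last row of a run of zeros at the very end of the data, a case the question does not cover.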

Upvotes: 1

iod

Reputation: 7592

A simple one-line solution (lead is called with its dplyr:: prefix so that it works without attaching dplyr):

dplyr::filter(dt, !(V2 == 0 & dplyr::lead(V2) == 0))

    V1 V2
1  350 16
2  359  0
3  366 11
4  376 38
5  380 28
6  389  0
7  397 55
8  398 45
9  399  0
10 400 19
11 402 30
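For comparison, the same "look at the next row" rule can be sketched in base R without any packages. The variable names `nxt`, `keep` and `res` are made up for illustration, and treating the last row as having no following value is an assumption, since the question does not say what should happen to a zero in the final row.

```r
dt <- read.table(text = "
350 16
352 0
354 0
359 0
366 11
376 38
380 28
386 0
389 0
397 55
398 45
399 0
400 19
402 30")

# V2 of the following row; the last row has no successor, so pad with NA
nxt <- c(dt$V2[-1], NA)

# Keep non-zero rows, plus zero rows whose successor exists and is non-zero
keep <- dt$V2 != 0 | (!is.na(nxt) & nxt != 0)

res <- dt[keep, ]
```

Under this rule a zero in the very last row would be dropped, because no non-zero row follows it.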

Upvotes: 1
