nchand

Reputation: 17

Pick rows from a dataframe where a column value changes in R

I'm trying to pick the first row and every row where a column's value changes, and create a new df from them.

See image of original df table

So I want the first row and the 5th row of the original df shown above in a new df. How do I do that? I tried the code below, but it returns other rows as well. Can someone help me fix this?

lst<- which(  df$A!= dplyr::lag(df$A)|
                     df$B!= dplyr::lag(df$B)|
                     df$C!= dplyr::lag(df$C)|
                     df$D!= dplyr::lag(df$D)) 

df_new<- df[lst,] 

Upvotes: 0

Views: 398

Answers (2)

akrun

Reputation: 887128

Using diff on the row sums, and always keeping the first row:

library(dplyr)
df %>% 
   filter(c(TRUE, diff(rowSums(.)) != 0))
#  A C B D
#1 0 1 1 0
#2 0 1 0 0
#3 0 0 0 0

data

df <- structure(list(A = c(0, 0, 0, 0, 0), C = c(1, 1, 1, 1, 0), B = c(1, 
1, 1, 0, 0), D = c(0, 0, 0, 0, 0)), class = "data.frame", row.names = c(NA, 
-5L))
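
To see why the one-liner above keeps those rows, it may help to unpack it step by step in base R. This is only a sketch of the same idea; the intermediate names rs, changed, keep and df_new are illustrative and do not appear in the answer.

```r
# Same data as in the answer above
df <- data.frame(A = c(0, 0, 0, 0, 0),
                 C = c(1, 1, 1, 1, 0),
                 B = c(1, 1, 1, 0, 0),
                 D = c(0, 0, 0, 0, 0))

rs <- rowSums(df)          # per-row totals: 2 2 2 1 0
changed <- diff(rs) != 0   # rows 2..5 compared with the previous row
keep <- c(TRUE, changed)   # prepend TRUE so the first row is always kept
df_new <- df[keep, ]       # original rows 1, 4 and 5

df_new
```

Unlike filter(), base subsetting keeps the original row names, so the result is labelled 1, 4 and 5 rather than 1, 2 and 3.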

Upvotes: 1

Rui Barradas

Reputation: 76412

Since the question includes dplyr code, here is a solution with package dplyr.

library(dplyr)

df %>%
  mutate(rs = rowSums(.),
         rs = rs != lag(rs, default = FALSE)) %>%
  filter(rs) %>%
  select(-rs)
#  A C B D
#1 0 1 1 0
#2 0 1 0 0
#3 0 0 0 0

Data

A <- rep(0, 5)
C <- c(rep(1, 4), 0)
B <- c(rep(1, 3), 0, 0)
D <- rep(0, 5)
df <- data.frame(A, C, B, D)
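
One caveat with comparing row sums: two changes that cancel out (one column goes up while another goes down) leave the sum unchanged, so that row is not picked. If that can happen in the real data, a column-by-column comparison with the previous row is closer to what the question's code attempts. The sketch below is one possible way to do it in base R; row_changed is a hypothetical helper name, the small data frame d is made up for illustration, and the data is assumed to contain no NA values.

```r
# Hypothetical helper (not from the answers): TRUE for the first row and
# for every row that differs from the previous row in at least one column.
# Assumes no NA values in d.
row_changed <- function(d) {
  n <- nrow(d)
  if (n < 2) return(rep(TRUE, n))
  m <- as.matrix(d)
  curr <- m[-1, , drop = FALSE]   # rows 2..n
  prev <- m[-n, , drop = FALSE]   # rows 1..(n-1)
  c(TRUE, unname(rowSums(curr != prev) > 0))
}

# Made-up example where the changes cancel out: every row sums to 1
d <- data.frame(A = c(0, 0, 1, 1), B = c(1, 1, 0, 0))
d_new <- d[row_changed(d), ]      # rows 1 and 3

# On the data from the answers this picks original rows 1, 4 and 5
df <- data.frame(A = rep(0, 5), C = c(rep(1, 4), 0),
                 B = c(rep(1, 3), 0, 0), D = rep(0, 5))
df_new <- df[row_changed(df), ]
```

With the made-up d, a row-sum comparison would keep only the first row, while the column-wise version also keeps row 3.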

Upvotes: 1
