JayBee

Reputation: 143

How to remove zero values until the first non-zero value occurs in an R dataframe?

The title says it all! I have grouped data where I'd like to remove the leading rows in each id group, up to (but not including) the first non-zero value.

Example code:

problem <- data.frame(
  id = c(1,1,1,1,2,2,2,2,3,3,3,3), 
  value = c(0,0,2,0,0,8,4,2,1,7,6,5)
)


solution <- data.frame(
  id = c(1,1,2,2,2,3,3,3,3), 
  value = c(2,0,8,4,2,1,7,6,5)
)

Upvotes: 1

Views: 1144

Answers (2)

sbha

Reputation: 10432

Here is a dplyr solution:

library(dplyr)
problem %>% 
  group_by(id) %>% 
  mutate(first_match = min(row_number()[value != 0])) %>% 
  filter(row_number() >= first_match) %>% 
  select(-first_match) %>% 
  ungroup()

# A tibble: 9 x 2
     id value
  <dbl> <dbl>
1     1     2
2     1     0
3     2     8
4     2     4
5     2     2
6     3     1
7     3     7
8     3     6
9     3     5

Or more succinctly per Tjebo's comment:

problem %>% 
  group_by(id) %>% 
  filter(row_number() >= min(row_number()[value != 0])) %>% 
  ungroup()
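
A possible alternative, as a minimal sketch rather than part of the original answer: dplyr's cumany() returns TRUE from the first TRUE onward, so it can stand in for the row_number() comparison. It also avoids the warning that min() raises on an empty vector when a group contains only zeros. The name result is just for illustration.

```r
library(dplyr)

# Same example data as in the question
problem <- data.frame(
  id = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3),
  value = c(0, 0, 2, 0, 0, 8, 4, 2, 1, 7, 6, 5)
)

result <- problem %>%
  group_by(id) %>%
  # cumany() is FALSE for the leading zeros and TRUE from the
  # first non-zero value to the end of each group
  filter(cumany(value != 0)) %>%
  ungroup()
```

On the example data this should keep the same nine rows as the versions above.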

Upvotes: 3

moodymudskipper

Reputation: 47350

You can do this in base R:

subset(problem, ave(value, id, FUN = cumsum) > 0)
#    id value
# 3   1     2
# 4   1     0
# 6   2     8
# 7   2     4
# 8   2     2
# 9   3     1
# 10  3     7
# 11  3     6
# 12  3     5

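Use abs(value) in place of value if your real data has negative values; otherwise the running sum can fall back to zero or below and drop rows that should be kept.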
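
To illustrate the point about negative values, here is a small sketch. The data frame neg is made up for the example and is not from the question; plain and fixed are likewise illustrative names.

```r
# Hypothetical data with a negative value (not from the question)
neg <- data.frame(
  id = c(1, 1, 1, 1, 1),
  value = c(0, -3, 3, 0, 5)
)

# Without abs(): the running sum is 0, -3, 0, 0, 5,
# so only the last row passes the > 0 test
plain <- subset(neg, ave(value, id, FUN = cumsum) > 0)

# With abs(): the running sum is 0, 3, 6, 6, 11,
# so every row after the leading zero is kept
fixed <- subset(neg, ave(abs(value), id, FUN = cumsum) > 0)
```

Here plain ends up with a single row, while fixed keeps the four rows starting at -3.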

Upvotes: 1
