Rspacer

Reputation: 2429

How to extract the first time a non-zero number occurs in a data frame in R?

In an experiment, I'm trying to find the time to first birth. There are four animals, identified by id and rep (A1, A2, B1, B2), each with an age and a number of babies recorded per row. For each combination of id and rep, I want to retain only the row where babies were first born.

id <- c("A","A","A","A","A","A","B","B","B","B","B","B","B","B","B")
rep <- c(1,1,1,2,2,2,1,1,1,1,2,2,2,2,2)
age <- c(0,1,2,0,1,2,0,1,2,3,0,1,2,3,4)
babies <- c(0,0,1,0,1,0,0,0,0,1,0,0,0,1,1)

df <- data.frame(id,rep,age,babies)

The final data frame should look like this:

id <- c("A","A","B","B")
rep <- c(1,2,1,2)
age <- c(2,1,3,3)
babies <- c(1,1,1,1)

df <- data.frame(id,rep,age,babies)

Upvotes: 6

Views: 89

Answers (4)

Michael Dewar

Reputation: 3293

You only need to group_by and filter:

library(dplyr)

df %>% 
  group_by(id, rep) %>% 
  filter(babies > 0) %>%        # keep only rows with at least one baby
  filter(age == min(age)) %>%   # youngest such age within each id/rep
  ungroup()

Upvotes: 3

Just James

Reputation: 1252

An alternative:

library(dplyr)

df |> 
  group_by(id, rep) |> 
  slice(which(c(0, diff(babies)) == 1)) |> 
  ungroup()

This accounts for an individual having more babies as they age.
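To see what the diff step selects, here is a small base R sketch applied to single vectors. The helper name first_birth_steps is hypothetical, and the second and third inputs are made up for illustration; only the first comes from the question's data.

```r
# Hypothetical helper wrapping the expression passed to slice() above.
first_birth_steps <- function(babies) {
  # Prepend 0 so positions line up with the original rows, then keep the
  # positions where the value rose by exactly 1 from the previous row.
  which(c(0, diff(babies)) == 1)
}

first_birth_steps(c(0, 0, 0, 1, 1))  # B, rep 2 from the question: position 4
first_birth_steps(c(0, 1, 0, 1))     # made-up input: positions 2 and 4
first_birth_steps(c(0, 2))           # made-up input: no position (jump of 2)
```

So on the question's data this returns one row per group, but it may return several rows, or none, if babies drops back to 0 and rises again or jumps by more than 1.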

Upvotes: 1

TarJae

Reputation: 79132

Here is one with arrange:

library(dplyr)

df %>% 
  group_by(id, rep) %>% 
  arrange(-babies, .by_group = TRUE) %>% 
  slice(1)

Output:

  id      rep   age babies
  <chr> <dbl> <dbl>  <dbl>
1 A         1     2      1
2 A         2     1      1
3 B         1     3      1
4 B         2     3      1

Upvotes: 2

akrun

Reputation: 887501

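library(dplyr)

# Option 1: take the row with the largest babies value in each group;
# with_ties = FALSE keeps a single row per group
df %>% 
  group_by(id, rep) %>% 
  slice_max(babies, n = 1, with_ties = FALSE) %>% 
  ungroup()

# Option 2: keep the row at the first position where babies > 0
df %>% 
  group_by(id, rep) %>% 
  filter(row_number() == which(babies > 0)[1]) %>% 
  ungroup()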
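For comparison, the same "first row with babies > 0 per group" idea can be sketched in base R without dplyr. This is only a rough equivalent, and it assumes the rows are already sorted by age within each id/rep, as they are in the question's data. The names born and first_birth are introduced here for illustration.

```r
# Data from the question
id <- c("A","A","A","A","A","A","B","B","B","B","B","B","B","B","B")
rep <- c(1,1,1,2,2,2,1,1,1,1,2,2,2,2,2)
age <- c(0,1,2,0,1,2,0,1,2,3,0,1,2,3,4)
babies <- c(0,0,1,0,1,0,0,0,0,1,0,0,0,1,1)
df <- data.frame(id, rep, age, babies)

# Keep only rows with at least one baby
born <- df[df$babies > 0, ]

# Within each id/rep pair, keep the first remaining row
# (assumes rows are ordered by age inside each group)
first_birth <- born[!duplicated(born[c("id", "rep")]), ]
rownames(first_birth) <- NULL

first_birth
```

If the rows might not be sorted, ordering df by id, rep and age first would be needed before the duplicated() step.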

Upvotes: 4
