NorthLattitude

Reputation: 211

Finding the first row after which x rows meet some criterion in R

A data wrangling question:

I have a dataframe of hourly animal tracking points with columns for id, time, and whether the animal is on land or in water (0 = water; 1 = land). It looks something like this:

set.seed(13)
n <- 100
dat <- data.frame(id = rep(1:5, each = n / 5),  # n rows in total, so no silent recycling of id
                  datetime=seq(as.POSIXct("2020-12-26 00:00:00"), as.POSIXct("2020-12-30 3:00:00"), by = "hour"),
                  land = sample(0:1, n, replace = TRUE))

What I need to do is flag the first row after which the animal uses land at least once a day for 3 consecutive days. I tried something like this:


library(dplyr)
library(tidyr) # for drop_na()

dat$ymd <- as.Date(dat$datetime) # make column for year-month-day (one value per row)

# add land points within each id group

land.pts <- dat %>% 
  group_by(id, ymd) %>%
  arrange(id, datetime) %>%
  drop_na(land) %>%
  mutate(all.land = cumsum(land))

#flag days that have any land points

flag <- land.pts %>%
  group_by(id, ymd) %>%
  arrange(id, datetime) %>%
  slice(n()) %>%
  mutate(flag = if_else(all.land == 0,0,1))

# Combine flagged dataframe with full dataframe

comb <- left_join(land.pts, flag)
comb[is.na(comb)] <- 1

and then I tried this:

x = comb %>% 
  group_by(id) %>% 
  arrange(id, datetime) %>% 
  mutate(time.land=ifelse(land==0 | is.na(lag(land)) | lag(land)==0 | flag==0, 
                          0,
                          difftime(datetime, lag(datetime), units="days"))) 

But I still can't quite wrap my head around what to do to make it so that I can figure out when the animal has been on land at least once for three days straight, and then flag that first point on land. Thanks so much for any help you can provide!

Upvotes: 0

Views: 358

Answers (1)

Ronak Shah

Reputation: 388982

Create a date column from the timestamp, then summarise the data to one row per id and date that records whether the animal was on land at least once that day.

Use zoo's rollapply() with a left-aligned window of width 3 to mark a day as TRUE when the animal was on land on that day and on each of the next two days. The last two days of each id come out as NA because the window runs past the end of the data.

library(dplyr)
library(zoo)

dat <- dat %>% mutate(date = as.Date(datetime))

dat %>%
  group_by(id, date) %>%
  summarise(on_land = any(land == 1)) %>%
  mutate(consec_three = rollapply(on_land, 3, all, align = 'left', fill = NA)) %>%
  ungroup() %>%
  # If you want all the rows of the data
  left_join(dat, by = c('id', 'date'))
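
If you then want a single flag on the first land point of the first qualifying day, something along these lines might work. This is only a rough sketch in base R (so it uses a plain loop instead of zoo), and the names flag_first_run, toy, first_flag and k are made up for illustration. It assumes dates should be taken in UTC and that a gap in the calendar should break a run; adjust both if that does not match your data.

```r
# Hypothetical helper: flag the first land fix on the first day that starts
# a run of k consecutive calendar days with at least one land fix each.
flag_first_run <- function(dat, k = 3) {
  dat <- dat[order(dat$id, dat$datetime), ]
  dat$date <- as.Date(dat$datetime, tz = "UTC")  # assumption: UTC calendar days
  dat$first_flag <- FALSE
  for (rows in split(seq_len(nrow(dat)), dat$id)) {
    d <- dat[rows, ]
    # One logical per day: was the animal on land at least once?
    daily <- tapply(d$land == 1, format(d$date), any, na.rm = TRUE)
    days <- as.Date(names(daily))
    start <- NA
    if (length(daily) >= k) {
      for (i in seq_len(length(daily) - k + 1)) {
        w <- i:(i + k - 1)
        # All k days have land use and the days are consecutive
        if (all(daily[w]) && all(as.numeric(diff(days[w])) == 1)) {
          start <- i
          break
        }
      }
    }
    if (!is.na(start)) {
      # First land fix on the day that opens the run
      hit <- rows[which(d$date == days[start] & d$land == 1)]
      dat$first_flag[hit[1]] <- TRUE
    }
  }
  dat
}

# Toy data (made-up values): one animal, hourly fixes over 5 days.
# Day 1 has no land fixes, days 2 to 4 each have at least one, day 5 has none.
toy <- data.frame(
  id = 1,
  datetime = seq(as.POSIXct("2020-12-26 00:00:00", tz = "UTC"),
                 by = "hour", length.out = 5 * 24),
  land = 0
)
toy$land[c(30, 40, 60, 80)] <- 1  # rows 30 and 40 are on day 2, 60 on day 3, 80 on day 4

res <- flag_first_run(toy)
res[res$first_flag, c("id", "datetime", "land")]  # the single flagged row (row 30)
```

The loop stops at the first qualifying window per id, so each animal gets at most one flagged row; animals that never reach k days in a row keep first_flag as FALSE throughout.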

Upvotes: 2
