cando
cando

Reputation: 33

Calculate consecutive occurrences of a value above a threshold in R

I have data for total sales per day for three people. I want to calculate occurrences above 70 sales per day. I then want to calculate the length of each event (event would be when consecutive days are above 70, e.g. an event would be 5 days in a row when sales are over 70 per day, or 2 days in a row when sales are over 70 per day). I would like a column that assigns the length of the event (e.g. 5 for 5 day, 2 for 2day) and when values are below 70 it should be 0. I want to be able to compare the length of events between the different people.

I can calculate occurrences (code below), but I am struggling with calculating the events with consecutive days.

set.seed(1234) 

#Load packages
library(dplyr)
library(lubridate)

# Create data
Date <- seq(as.Date("2010-01-01"), as.Date("2020-01-31"), by="days")
Tim_Sales <- sample(1 : 100, 3683, replace=TRUE) 
John_Sales <- sample(1 : 100, 3683, replace=TRUE) 
Rupert_Sales <- sample(1 : 100, 3683, replace=TRUE) 

# Make dataframe
df <- data.frame(Date, Tim_Sales, John_Sales, Rupert_Sales)

# Identify occurences above 70 sales
df_occurrences <- df %>%
 group_by(Date) %>%
 summarise_all(funs(sum(. >70))) %>%
 ungroup
df_occurrences

Upvotes: 0

Views: 333

Answers (1)

Mel G
Mel G

Reputation: 142

This approach uses as an ifelse statement and mutate to create a new column, based on a set of conditions.

If a salesperson has more than 70 sales, subtract his or her number of sales from 70. Otherwise, the column equals zero. You can read more about the ifelse statement in the help file, accessed by typing ?ifelse.

library(tidyverse)

df_occurrences <- df %>%
  mutate(Tim_70plus= ifelse(Tim_Sales > 70, Tim_Sales- 70, 0),
         John_70plus= ifelse(John_Sales > 70, John_Sales- 70, 0),
         Rupert_70plus= ifelse(Rupert_Sales > 70, Rupert_Sales- 70, 0))

Upvotes: 2

Related Questions