Reputation: 33
I have data for total sales per day for three people. I want to calculate occurrences above 70 sales per day. I then want to calculate the length of each event (event would be when consecutive days are above 70, e.g. an event would be 5 days in a row when sales are over 70 per day, or 2 days in a row when sales are over 70 per day). I would like a column that assigns the length of the event (e.g. 5 for 5 day, 2 for 2day) and when values are below 70 it should be 0. I want to be able to compare the length of events between the different people.
I can calculate occurrences (code below), but I am struggling with calculating the events with consecutive days.
set.seed(1234)
#Load packages
library(dplyr)
library(lubridate)
# Create data
Date <- seq(as.Date("2010-01-01"), as.Date("2020-01-31"), by="days")
Tim_Sales <- sample(1 : 100, 3683, replace=TRUE)
John_Sales <- sample(1 : 100, 3683, replace=TRUE)
Rupert_Sales <- sample(1 : 100, 3683, replace=TRUE)
# Make dataframe
df <- data.frame(Date, Tim_Sales, John_Sales, Rupert_Sales)
# Identify occurences above 70 sales
df_occurrences <- df %>%
group_by(Date) %>%
summarise_all(funs(sum(. >70))) %>%
ungroup
df_occurrences
Upvotes: 0
Views: 333
Reputation: 142
This approach uses as an ifelse
statement and mutate
to create a new column, based on a set of conditions.
If a salesperson has more than 70 sales, subtract his or her number of sales from 70. Otherwise, the column equals zero. You can read more about the ifelse
statement in the help file, accessed by typing ?ifelse
.
library(tidyverse)
df_occurrences <- df %>%
mutate(Tim_70plus= ifelse(Tim_Sales > 70, Tim_Sales- 70, 0),
John_70plus= ifelse(John_Sales > 70, John_Sales- 70, 0),
Rupert_70plus= ifelse(Rupert_Sales > 70, Rupert_Sales- 70, 0))
Upvotes: 2