Reputation: 1155
I am trying to sample a dataset based on a Date-class column: quarterly for "Active" rows and monthly for "Inactive" rows.
Here's my code:
library(dplyr)
library(lubridate)
## data ##
df <- structure(list(
mes = c("01/01/2000", "01/02/2000", "01/03/2000",
"01/04/2000", "01/05/2000", "01/06/2000", "01/07/2000", "01/08/2000",
"01/09/2000", "01/10/2000", "01/11/2000", "01/12/2000"),
status = c("Active", "Inactive",
"Active", "Inactive",
"Active", "Inactive",
"Active", "Active",
"Inactive", "Active",
"Inactive", "Active")),
class = "data.frame",
row.names = c(NA, -12L))
## setting date class for "mes" column ##
df$mes <- as.Date(df$mes,
format = "%d/%m/%Y")
## sampling ##
sample_df <- df %>%
dplyr :: filter(status %in% "Active",
status %in% "Inactive") %>%
dplyr :: filter_if(status == "Active",
month(mes) %in% c(3,6,9,12),
month(mes) %in% c(1,2,3,4,5,6,7,8,9,10,11,12))
Console output:
Error in is_logical(.p) : object 'status' not found
Is there any other library that I could use to accomplish this task?
Upvotes: 0
Views: 61
Reputation: 388807
To keep the quarterly months for "Active" status and all months for "Inactive", you could do:
library(dplyr)
df %>%
  mutate(month = lubridate::month(mes)) %>%
  filter(status == "Active" & month %in% c(3,6,9,12) |
         status == "Inactive" & month %in% 1:12)
#         mes   status month
#1 2000-02-01 Inactive     2
#2 2000-03-01   Active     3
#3 2000-04-01 Inactive     4
#4 2000-06-01 Inactive     6
#5 2000-09-01 Inactive     9
#6 2000-11-01 Inactive    11
#7 2000-12-01   Active    12
Since you want all months for the "Inactive" status, you can also drop the month check for it:
df %>%
  mutate(month = lubridate::month(mes)) %>%
  filter(status == "Active" & month %in% c(3,6,9,12) |
         status == "Inactive")
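As a quick sanity check, the sketch below rebuilds the question's data and compares the two filters. The names `with_month_check` and `without_month_check` are only illustrative, and the dates are generated with `seq()` on the assumption that they match the first-of-month values in the question.

```r
library(dplyr)

# Rebuild the question's data: first day of each month in 2000 plus its status
df <- data.frame(
  mes = seq(as.Date("2000-01-01"), by = "month", length.out = 12),
  status = c("Active", "Inactive", "Active", "Inactive", "Active", "Inactive",
             "Active", "Active", "Inactive", "Active", "Inactive", "Active")
)

df <- df %>% mutate(month = lubridate::month(mes))

# Parentheses only make the grouping explicit; & already binds tighter than |
with_month_check <- df %>%
  filter((status == "Active" & month %in% c(3, 6, 9, 12)) |
         (status == "Inactive" & month %in% 1:12))

# Same filter without the always-true month check for "Inactive"
without_month_check <- df %>%
  filter((status == "Active" & month %in% c(3, 6, 9, 12)) |
         status == "Inactive")

# Compare the two results
identical(with_month_check, without_month_check)
```

Both versions should keep the same seven rows shown in the output above.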
Upvotes: 1
Reputation: 886938
With dplyr::filter, separating conditions with , means & (AND); here we need | (OR) instead. Using & would return 0 rows, because 'status' can't be both 'Active' and 'Inactive' in the same row.
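df %>%
  # | (OR) keeps rows whose status matches either value
  dplyr::filter(status %in% "Active" | status %in% "Inactive") %>%
  # , (AND) keeps only 'Active' rows that fall in months 3, 6, 9, 12
  dplyr::filter(status == 'Active', month(mes) %in% c(3, 6, 9, 12))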
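To illustrate the difference between `,`/`&` and `|`, here is a minimal sketch on a small made-up data frame (the three-row `toy` data and the result names are assumptions for illustration, not the question's data):

```r
library(dplyr)

# Hypothetical three-row example
toy <- data.frame(status = c("Active", "Inactive", "Active"))

# Comma-separated conditions are combined with AND
comma_result <- toy %>% filter(status %in% "Active", status %in% "Inactive")

# Writing & explicitly is equivalent to the comma form
and_result <- toy %>% filter(status %in% "Active" & status %in% "Inactive")

# | keeps rows that satisfy either condition
or_result <- toy %>% filter(status %in% "Active" | status %in% "Inactive")

nrow(comma_result)  # 0: no row is both "Active" and "Inactive"
nrow(and_result)    # 0
nrow(or_result)     # 3
```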
Also, since we are using %in%, the right-hand side of the operator can be a vector of values with length >= 1:
df %>%
  # a single %in% with a vector replaces the two OR-ed conditions
  dplyr::filter(status %in% c("Active", "Inactive")) %>%
  dplyr::filter(status == 'Active', month(mes) %in% c(3, 6, 9, 12))
In the OP's filter statement
...
month(mes) %in% c(3,6,9,12),
month(mes) %in% c(1,2,3,4,5,6,7,8,9,10,11,12)
implies that both conditions must be true, but the first is a subset of the second, so the second condition adds nothing.
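The redundancy can be checked directly. This is a small sketch, assuming first-of-month dates like those in the question; `quarterly_only` and `both_conditions` are illustrative names.

```r
library(dplyr)
library(lubridate)

# One date per month of 2000, mirroring the question's 'mes' column
df <- data.frame(mes = seq(as.Date("2000-01-01"), by = "month", length.out = 12))

# Only the quarterly condition
quarterly_only <- df %>%
  filter(month(mes) %in% c(3, 6, 9, 12))

# Quarterly condition AND the "any month" condition, as in the OP's call
both_conditions <- df %>%
  filter(month(mes) %in% c(3, 6, 9, 12),
         month(mes) %in% 1:12)

# Every valid date has a month in 1:12, so the second condition never removes a row
all(month(df$mes) %in% 1:12)
identical(quarterly_only, both_conditions)
```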
Upvotes: 2