Reputation: 61
As default I set the argument cut.points
as NA and if it's on default then it shouldn't do anything with the data.
But if user decides to put for example cut.points = c("2012-01-01", "2013-01-01")
then the data should be filtered by the column that has dates in it. And it should return only dates between 2012 to 2013.
The problem is that I'm reading data from the function so in theory i won't know what is the name of this date column that uses provides. So i find the column with dates and store it's name in the variable.
But the condition which i wrote that should filter based od this variable doesn't work:
modifier <- function(input.data, cut.points = c(NA, NA)) {
date_check <- sapply(input.data, function(x) !all(is.na(as.Date(as.character(x),format="%Y-%m-%d"))))
if (missing(cut.points)) {
input.data
} else {
cols <- colnames(select_if(input.data, date_check == TRUE))
cut.points <- as.Date(cut.points)
input.data <- filter(input.data, cols > cut.points[1] & cols < cut.points[2])
}
}
for ex. when i try to run this:
modifier(ex_data, cut.points = c("2012-01-01", "2013-01-01"))
On sample like this:
ex_data
Row.ID Order.ID Order.Date
1 32298 CA-2012-124891 2012-07-31
2 26341 IN-2013-77878 2013-02-05
3 25330 IN-2013-71249 2013-10-17
4 13524 ES-2013-1579342 2013-01-28
5 47221 SG-2013-4320 2013-11-05
6 22732 IN-2013-42360 2013-06-28
7 30570 IN-2011-81826 2011-11-07
8 31192 IN-2012-86369 2012-04-14
9 40155 CA-2014-135909 2014-10-14
10 40936 CA-2012-116638 2012-01-28
11 34577 CA-2011-102988 2011-04-05
12 28879 ID-2012-28402 2012-04-19
13 45794 SA-2011-1830 2011-12-27
14 4132 MX-2012-130015 2012-11-13
15 27704 IN-2013-73951 2013-06-06
16 13779 ES-2014-5099955 2014-07-31
17 36178 CA-2014-143567 2014-11-03
18 12069 ES-2014-1651774 2014-09-08
19 22096 IN-2014-11763 2014-01-31
20 49463 TZ-2014-8190 2014-12-05
the error is: character string is not in a standard unambiguous format
Upvotes: 1
Views: 141
Reputation: 2541
I've added lubridate
as a dependency so I could get access to %within%
and is.Date
. I've also changed the check condition, because I don't think your original one would work with NA, NA
.
library(tidyverse)
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
ex_data <- read_table(" Row.ID Order.ID Order.Date
1 32298 CA-2012-124891 2012-07-31
2 26341 IN-2013-77878 2013-02-05
3 25330 IN-2013-71249 2013-10-17
4 13524 ES-2013-1579342 2013-01-28
5 47221 SG-2013-4320 2013-11-05
6 22732 IN-2013-42360 2013-06-28
7 30570 IN-2011-81826 2011-11-07
8 31192 IN-2012-86369 2012-04-14
9 40155 CA-2014-135909 2014-10-14
10 40936 CA-2012-116638 2012-01-28
11 34577 CA-2011-102988 2011-04-05
12 28879 ID-2012-28402 2012-04-19
13 45794 SA-2011-1830 2011-12-27
14 4132 MX-2012-130015 2012-11-13
15 27704 IN-2013-73951 2013-06-06
16 13779 ES-2014-5099955 2014-07-31
17 36178 CA-2014-143567 2014-11-03
18 12069 ES-2014-1651774 2014-09-08
19 22096 IN-2014-11763 2014-01-31
20 49463 TZ-2014-8190 2014-12-05")
#> Warning: Missing column names filled in: 'X1' [1]
modifier <- function(input.data, cut.points = NULL) {
if (length(cut.points) == 2) {
date_col <- colnames(input.data)[sapply(input.data, is.Date)]
filtered.data <- input.data %>%
rename(Date = !! date_col) %>%
filter(Date %within% interval(cut.points[1], cut.points[2])) %>%
rename_with(~ date_col, Date)
return(filtered.data)
} else {
input.data
}
}
modifier(ex_data, cut.points = c("2012-01-01", "2013-01-01"))
#> # A tibble: 5 x 4
#> X1 Row.ID Order.ID Order.Date
#> <dbl> <dbl> <chr> <date>
#> 1 1 32298 CA-2012-124891 2012-07-31
#> 2 8 31192 IN-2012-86369 2012-04-14
#> 3 10 40936 CA-2012-116638 2012-01-28
#> 4 12 28879 ID-2012-28402 2012-04-19
#> 5 14 4132 MX-2012-130015 2012-11-13
Upvotes: 1