LLawsford
LLawsford

Reputation: 61

Cannot filter column when name of that column comes from variable

As default I set the argument cut.points as NA and if it's on default then it shouldn't do anything with the data.

But if user decides to put for example cut.points = c("2012-01-01", "2013-01-01") then the data should be filtered by the column that has dates in it. And it should return only dates between 2012 to 2013. The problem is that I'm reading data from the function so in theory i won't know what is the name of this date column that uses provides. So i find the column with dates and store it's name in the variable.

But the condition which i wrote that should filter based od this variable doesn't work:

modifier <- function(input.data, cut.points = c(NA, NA)) {
  date_check <- sapply(input.data, function(x) !all(is.na(as.Date(as.character(x),format="%Y-%m-%d"))))
  if (missing(cut.points)) {
    input.data
  } else {
    cols <- colnames(select_if(input.data, date_check == TRUE))
    cut.points <- as.Date(cut.points)
    input.data <- filter(input.data, cols > cut.points[1] & cols < cut.points[2])
  }
}

for ex. when i try to run this:

modifier(ex_data, cut.points = c("2012-01-01", "2013-01-01"))

On sample like this:

    ex_data
   Row.ID        Order.ID Order.Date
1   32298  CA-2012-124891 2012-07-31
2   26341   IN-2013-77878 2013-02-05
3   25330   IN-2013-71249 2013-10-17
4   13524 ES-2013-1579342 2013-01-28
5   47221    SG-2013-4320 2013-11-05
6   22732   IN-2013-42360 2013-06-28
7   30570   IN-2011-81826 2011-11-07
8   31192   IN-2012-86369 2012-04-14
9   40155  CA-2014-135909 2014-10-14
10  40936  CA-2012-116638 2012-01-28
11  34577  CA-2011-102988 2011-04-05
12  28879   ID-2012-28402 2012-04-19
13  45794    SA-2011-1830 2011-12-27
14   4132  MX-2012-130015 2012-11-13
15  27704   IN-2013-73951 2013-06-06
16  13779 ES-2014-5099955 2014-07-31
17  36178  CA-2014-143567 2014-11-03
18  12069 ES-2014-1651774 2014-09-08
19  22096   IN-2014-11763 2014-01-31
20  49463    TZ-2014-8190 2014-12-05

the error is: character string is not in a standard unambiguous format

Upvotes: 1

Views: 141

Answers (1)

csgroen
csgroen

Reputation: 2541

I've added lubridateas a dependency so I could get access to %within% and is.Date. I've also changed the check condition, because I don't think your original one would work with NA, NA.

library(tidyverse)
library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#> 
#>     date, intersect, setdiff, union

ex_data <- read_table("   Row.ID        Order.ID Order.Date
1   32298  CA-2012-124891 2012-07-31
2   26341   IN-2013-77878 2013-02-05
3   25330   IN-2013-71249 2013-10-17
4   13524 ES-2013-1579342 2013-01-28
5   47221    SG-2013-4320 2013-11-05
6   22732   IN-2013-42360 2013-06-28
7   30570   IN-2011-81826 2011-11-07
8   31192   IN-2012-86369 2012-04-14
9   40155  CA-2014-135909 2014-10-14
10  40936  CA-2012-116638 2012-01-28
11  34577  CA-2011-102988 2011-04-05
12  28879   ID-2012-28402 2012-04-19
13  45794    SA-2011-1830 2011-12-27
14   4132  MX-2012-130015 2012-11-13
15  27704   IN-2013-73951 2013-06-06
16  13779 ES-2014-5099955 2014-07-31
17  36178  CA-2014-143567 2014-11-03
18  12069 ES-2014-1651774 2014-09-08
19  22096   IN-2014-11763 2014-01-31
20  49463    TZ-2014-8190 2014-12-05")
#> Warning: Missing column names filled in: 'X1' [1]

modifier <- function(input.data, cut.points = NULL) {
    if (length(cut.points) == 2) {
        date_col <- colnames(input.data)[sapply(input.data, is.Date)]
        filtered.data <- input.data %>%
            rename(Date = !! date_col) %>%
            filter(Date %within% interval(cut.points[1], cut.points[2])) %>%
            rename_with(~ date_col, Date)
        return(filtered.data)
    } else {
        input.data
    }
}
modifier(ex_data, cut.points = c("2012-01-01", "2013-01-01"))
#> # A tibble: 5 x 4
#>      X1 Row.ID Order.ID       Order.Date
#>   <dbl>  <dbl> <chr>          <date>    
#> 1     1  32298 CA-2012-124891 2012-07-31
#> 2     8  31192 IN-2012-86369  2012-04-14
#> 3    10  40936 CA-2012-116638 2012-01-28
#> 4    12  28879 ID-2012-28402  2012-04-19
#> 5    14   4132 MX-2012-130015 2012-11-13

Upvotes: 1

Related Questions