DarioB
DarioB

Reputation: 1609

Google bigquery subtract any number of business day from date

I am trying to build a function that can subtract any arbitrary number of business days from a date. So far I came out with this

CREATE TEMPORARY FUNCTION working_days_diff(the_date DATE, num_of_days INT64) AS
(
  CASE WHEN EXTRACT(DAYOFWEEK FROM the_date) in (2,3,4,5,6,7)
    THEN
      CASE WHEN (EXTRACT(DAYOFWEEK FROM the_date) - num_of_days) > 1
      THEN DATE_SUB(the_date, INTERVAL num_of_days DAY)
      ELSE DATE_SUB(the_date, INTERVAL (num_of_days +2) DAY)
      END
    ELSE
      DATE_SUB(the_date, INTERVAL (num_of_days +1) DAY)
  END
);

This works well when the number of days to subtract is less than a week apart, like the following cases :

 select working_days_diff(DATE("2018-04-12"), 3)
 UNION ALL
 select working_days_diff(DATE("2018-04-12"), 4)
 UNION ALL
 select working_days_diff(DATE("2018-04-12"), 5)
 UNION ALL
 select working_days_diff(DATE("2018-04-12"), 6)
 UNION ALL
 select working_days_diff(DATE("2018-04-12"), 7)
 UNION ALL
 select working_days_diff(DATE("2018-04-12"), 8)

But when the number of days I want to subtract gets bigger, it fails. For example:

 select working_days_diff(DATE("2018-04-12"), 9) -- this should return "2018-03-30"
 UNION ALL
 select working_days_diff(DATE("2018-04-12"), 10) -- this should return "2018-03-29"
 UNION ALL
 select working_days_diff(DATE("2018-04-12"), 20) -- this should return "2018-03-15"

this question is related to my previous question: bigquery subtract 3 business days from date

Upvotes: 2

Views: 4103

Answers (2)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172974

As you already got comfortable with version that correctly process some cases but not all - I thought of coming with version that will be as least different from what you already have as possible (of course assuming that it is correct for those cases that you described in your question - so I just extend it to the rest of cases)

So, first - in your function you replace all entries of num_of_days with MOD(num_of_days, 5) - by doing this you kind of ignoring all (if any) full working weeks (5 working days) - for example instead of 9 days you calculate respective day for just 1 day, for 10 and 20 - 0 days and so on

Now you need to "jump" back from that "intermediate" day for that many weeks as you "ignored" in previous step
For this you "embracing" previous result into - DATE_SUB(... , INTERVAL DIV(num_of_days, 5) WEEK)

That's all!

So final version will be as below

  #standardSQL
  CREATE TEMPORARY FUNCTION working_days_diff(the_date DATE, num_of_days INT64) AS
  (
    DATE_SUB(CASE WHEN EXTRACT(DAYOFWEEK FROM the_date) IN (2,3,4,5,6,7)
      THEN
        CASE WHEN (EXTRACT(DAYOFWEEK FROM the_date) - MOD(num_of_days, 5)) > 1
        THEN DATE_SUB(the_date, INTERVAL MOD(num_of_days, 5) DAY)
        ELSE DATE_SUB(the_date, INTERVAL (MOD(num_of_days, 5) + 2) DAY)
        END
      ELSE
        DATE_SUB(the_date, INTERVAL (MOD(num_of_days, 5) +1) DAY)
    END, INTERVAL DIV(num_of_days, 5) WEEK) 
  );

You can test / play with it using your examples in question

  #standardSQL
  CREATE TEMPORARY FUNCTION working_days_diff(the_date DATE, num_of_days INT64) AS
  (
    DATE_SUB(CASE WHEN EXTRACT(DAYOFWEEK FROM the_date) IN (2,3,4,5,6,7)
      THEN
        CASE WHEN (EXTRACT(DAYOFWEEK FROM the_date) - MOD(num_of_days, 5)) > 1
        THEN DATE_SUB(the_date, INTERVAL MOD(num_of_days, 5) DAY)
        ELSE DATE_SUB(the_date, INTERVAL (MOD(num_of_days, 5) + 2) DAY)
        END
      ELSE
        DATE_SUB(the_date, INTERVAL (MOD(num_of_days, 5) +1) DAY)
    END, INTERVAL DIV(num_of_days, 5) WEEK) 
  );
  SELECT working_days_diff(DATE("2018-04-12"), 3) UNION ALL
  SELECT working_days_diff(DATE("2018-04-12"), 4) UNION ALL
  SELECT working_days_diff(DATE("2018-04-12"), 5) UNION ALL
  SELECT working_days_diff(DATE("2018-04-12"), 6) UNION ALL
  SELECT working_days_diff(DATE("2018-04-12"), 7) UNION ALL
  SELECT working_days_diff(DATE("2018-04-12"), 8) UNION ALL
  SELECT working_days_diff(DATE("2018-04-12"), 9) UNION ALL-- this should return "2018-03-30"
  SELECT working_days_diff(DATE("2018-04-12"), 10) UNION ALL-- this should return "2018-03-29"
  SELECT working_days_diff(DATE("2018-04-12"), 20) -- this should return "2018-03-15"

result is now as expected

Row f0_  
1   2018-04-09   
2   2018-04-06   
3   2018-04-05   
4   2018-04-04   
5   2018-04-03   
6   2018-04-02   
7   2018-03-30   
8   2018-03-29   
9   2018-03-15   

I feel - this can be further optimized a little - but my target was to not do do so - but rather leave it is close to what you already got as possible - so it will be easier for you to absorb and further modify and use as needed in real use cases

Upvotes: 3

Tim Biegeleisen
Tim Biegeleisen

Reputation: 520938

Appreciate that subtracting a unit of 7 days, namely a week, will always result in subtracting only 5 business days. This is because a 7 day period, starting on any day, will always cross an entire weekend. One approach to your problem would be to subtract off full weeks, and then use a CASE expression to handle the remainder days, as you are already doing.

CREATE TEMPORARY FUNCTION working_days_diff(the_date DATE, num_of_days INT64) AS
(
    CASE WHEN EXTRACT(DAYOFWEEK FROM the_date) in (2,3,4,5,6,7)
         THEN
         CASE WHEN (EXTRACT(DAYOFWEEK FROM the_date) - MOD(num_of_days, 7)) > 1
              THEN DATE_SUB(the_date, INTERVAL ((7*num_of_days)/5 + MOD(num_of_days, 7)) DAY)
              ELSE DATE_SUB(the_date, INTERVAL ((7*num_of_days)/5) + MOD(num_of_days, 7) + 2) DAY) END
         ELSE
             DATE_SUB(the_date, INTERVAL ((7*num_of_days)/5 + MOD(num_of_days, 7) + 1) DAY) END
);

Note: You may have to cast my formulas to integer type using something like:

DATE_SUB(the_date, INTERVAL (CAST((7*num_of_days)/5) + MOD(num_of_days, 7) + 2) AS INTEGER) DAY)

Upvotes: 2

Related Questions