Number of absent rows in daterange

Question

I have a table with following structure

 transaction_id    user_id     date_column   
   1                  1        01-08-2011     
   2                  2        01-08-2011    
   3                  1        02-08-2011   
   4                  1        03-08-2011

There can be at-max only one entry for each user on each date.

How can get all rows where user_id is not present for specific date range. So for above table with user_id= 2 and date range 01-08-2011 to 03-08-2011, I want

result
02-08-2011
03-08-2011

Right now, I am using for loop to loop over all dates in given date range. This is working fine with small date range, but I think it will become resource heavy for large one.

Matt Fenwick · Accepted Answer

As suggested in a comment, create a table with the dates of interest (I'll call it datesofinterest). Every date from your date range needs to be put into this table.

datesofinterest table
--------------
date
--------------
01-08-2011
02-08-2011
03-08-2011

Then the datesofinterest table needs to be joined with all the userids -- this is the set of all possible combinations of dates-of-interest and userids.

Now you have to remove all those dates-of-interest/userids that are currently in your original table to get your final answer.

In relational algebra, it'd be something like:

(datesofinterest[date] x transaction[user_id]) - (transaction[date_column, user_id])

This page may help with translating '-' to SQL. Generating dates to populate the datesofinterest table can be done in SQL, manually, or with a helper program (perl's DateTime)

Number of absent rows in daterange

Answers (1)

Related Questions