noway
noway

Reputation: 2585

Counting only consecutive dates in MS Access

I'm trying to query to calculate consecutive overdue days for some customers. I also have a primary key identifying the customers.

Sample Table

Date (d.m.y)     Name 
01.01.2014       Alex
02.01.2014       Alex
03.01.2014       Alex
01.01.2014       Bianca
02.01.2014       Bianca
08.07.2014       Alex
09.07.2014       Alex
10.07.2014       Alex
11.07.2014       Alex

How can I differentiate the names and only count consecutive days when making a SELECT COUNT()?

Desired Result
Name     Overdue Day Count   Date
Alex          3              01.01.2014  <== The date is the first overdue date
Bianca        2              01.01.2014
Alex          4              08.07.2014

Upvotes: 3

Views: 2855

Answers (2)

Gord Thompson
Gord Thompson

Reputation: 123654

Queries that use [NOT] EXISTS can be slow, so here is a solution that might run a bit faster.

We start by creating a query that finds the starting date for each consecutive group of dates by name

SELECT t1.Date, t1.Name
FROM 
    T AS t1 
    LEFT JOIN 
    T AS t2 
        ON t1.Name=t2.Name 
            AND t1.Date=DateDiff("d",-1,t2.Date)
WHERE t2.Date IS NULL

It gives us

Date        Name  
----------  ------
2014-01-01  Alex  
2014-01-01  Bianca
2014-07-08  Alex  

We can link that query back to the main table with an unequal join

SELECT t3.Date, t3.Name
FROM
    T AS t3
    INNER JOIN
    (
        SELECT t1.Date, t1.Name
        FROM 
            T AS t1 
            LEFT JOIN 
            T AS t2 
                ON t1.Name=t2.Name 
                    AND t1.Date=DateDiff("d",-1,t2.Date)
        WHERE t2.Date IS NULL
    ) AS StartDates
        ON t3.Name=StartDates.Name AND t3.Date>=StartDates.Date

to produce the following, which repeats rows for each successive group

Date        Name  
----------  ------
2014-01-01  Alex  
2014-01-02  Alex  
2014-01-03  Alex  
2014-01-01  Bianca
2014-01-02  Bianca
2014-07-08  Alex  
2014-07-08  Alex  
2014-07-09  Alex  
2014-07-09  Alex  
2014-07-10  Alex  
2014-07-10  Alex  
2014-07-11  Alex  
2014-07-11  Alex  

so if we tweak that into an aggregation query we can assign a group number for each row in the original table

SELECT t3.Date, t3.Name, COUNT(*) AS GroupNo
FROM
    T AS t3
    INNER JOIN
    (
        SELECT t1.Date, t1.Name
        FROM 
            T AS t1 
            LEFT JOIN 
            T AS t2 
                ON t1.Name=t2.Name 
                    AND t1.Date=DateDiff("d",-1,t2.Date)
        WHERE t2.Date IS NULL
    ) AS StartDates
        ON t3.Name=StartDates.Name AND t3.Date>=StartDates.Date
GROUP BY t3.Date, t3.Name

resulting in

Date        Name    GroupNo
----------  ------  -------
2014-01-01  Alex          1
2014-01-02  Alex          1
2014-01-03  Alex          1
2014-01-01  Bianca        1
2014-01-02  Bianca        1
2014-07-08  Alex          2
2014-07-09  Alex          2
2014-07-10  Alex          2
2014-07-11  Alex          2

Finally, we can wrap the whole thing in another aggregation query

SELECT 
    Grouped.Name, 
    COUNT(*) AS DaysOverdue,
    MIN(Grouped.Date) AS OverdueSince
FROM
    (
        SELECT t3.Date, t3.Name, COUNT(*) AS GroupNo
        FROM
            T AS t3
            INNER JOIN
            (
                SELECT t1.Date, t1.Name
                FROM 
                    T AS t1 
                    LEFT JOIN 
                    T AS t2 
                        ON t1.Name=t2.Name 
                            AND t1.Date=DateDiff("d",-1,t2.Date)
                WHERE t2.Date IS NULL
            ) AS StartDates
                ON t3.Name=StartDates.Name AND t3.Date>=StartDates.Date
        GROUP BY t3.Date, t3.Name
    ) AS Grouped
GROUP BY Grouped.Name, Grouped.GroupNo
ORDER BY 3, 1

to produce the final result

Name    DaysOverdue  OverdueSince
------  -----------  ------------
Alex              3  2014-01-01  
Bianca            2  2014-01-01  
Alex              4  2014-07-08  

Upvotes: 2

GarethD
GarethD

Reputation: 69769

This is a gaps and islands problem (specifically islands) - Unfortunately I think the only solution supported in access is very inefficient:

SELECT  Name, 
        COUNT(*) AS Days,
        MIN(Date) AS FirstDate, 
        MAX(Date) AS LastDate
FROM    (   SELECT  Name,
                    Date,
                    (   SELECT  MIN(B.Date)
                        FROM    T AS B
                        WHERE   B.Date >= A.Date
                        AND     B.Name = A.Name
                        AND     NOT EXISTS
                                (   SELECT  1
                                    FROM    T AS C
                                    WHERE   C.Name = B.Name
                                    AND     C.Date = B.Date + 1
                                )
                    ) AS grp
            FROM    T AS A
        ) AS D
GROUP BY Name, grp;

There is a full explanation in the article linked above, but the sub-queries used to create the column grp find the end of each particular island, and then this value can be used to group the outer query.

Upvotes: 5

Related Questions