Peete
Peete

Reputation: 125

SQL: Count of rows since certain value first occurred

In SQL Server, I'm trying to calculate the count of days since the same weather as today (let's assume today is 6th August 2018) was observed first in the past 5 days. Per town.

Here's the data:

+---------+---------+--------+--------+--------+
| Date    | Toronto | Cairo  | Zagreb | Ankara |
+---------+---------+--------+--------+--------+
| 1.08.18 | Rain    | Sun    | Clouds | Sun    |
| 2.08.18 | Sun     | Sun    | Clouds | Sun    |
| 3.08.18 | Rain    | Sun    | Clouds | Rain   |
| 4.08.18 | Clouds  | Sun    | Clouds | Clouds |
| 5.08.18 | Rain    | Clouds | Rain   | Rain   |
| 6.08.18 | Rain    | Sun    | Sun    | Sun    |
+---------+---------+--------+--------+--------+

This needs to perform well but all I came up with so far is single queries for each town (and there are going to be dozens of towns, not just the four). This works but is not going to scale.

Here's the one for Toronto...

SELECT 
    DATEDIFF(DAY, MIN([Date]), GETDATE()) + 1 
FROM
    (SELECT TOP 5 * 
     FROM Weather 
     WHERE [Date] <= GETDATE()
     ORDER BY [Date] DESC) a
WHERE 
    Toronto = (SELECT TOP 1 Toronto 
               FROM Weather
               WHERE DataDate = GETDATE())

...which correctly returns 4 since today there is rain and the first occurrence of rain within the past 5 days was 3rd August.

But what I want returned is a table like this:

+---------+-------+--------+--------+
| Toronto | Cairo | Zagreb | Ankara |
+---------+-------+--------+--------+
| 4       | 5     | 1      | 5      |
+---------+-------+--------+--------+

How is this possible?

Upvotes: 0

Views: 182

Answers (2)

Paul Maxwell
Paul Maxwell

Reputation: 35573

You really don't want to be trying to do this on pivoted data, and while you state that the data isn't stored that way, you haven't shown us how you got to the pivot of cities as columns - which is a shame.

So, I have "unpivoted" the sample in a common table expression, then used an apply operator to count the prior occurrences of the same weather in 5 previous days. As it seems you know how to pivot I leave it to you to then pivot the final result.

with cte as (
        select
              date, city, weather
        FROM (
              SELECT * from mytable
             ) AS cp
        UNPIVOT (
                  Weather FOR City IN (Toronto, Cairo, Zagreb, Ankara)
            ) AS up
    )

select 
        date, city, weather, ca.prior
from cte
cross apply (
    select count(*) as prior
    from cte as prev 
    where prev.city = cte.city
    and prev.date between dateadd(day,-6,cte.date) and dateadd(day,-1,cte.date)
    and prev.weather = cte.weather
    ) ca

Using this sample data:

CREATE TABLE mytable(
   Date    date  NOT NULL
  ,Toronto VARCHAR(9) NOT NULL
  ,Cairo   VARCHAR(9) NOT NULL
  ,Zagreb  VARCHAR(9) NOT NULL
  ,Ankara  VARCHAR(9) NOT NULL
);
INSERT INTO mytable(Date,Toronto,Cairo,Zagreb,Ankara) VALUES ('20180801','Rain','Sun','Clouds','Sun');
INSERT INTO mytable(Date,Toronto,Cairo,Zagreb,Ankara) VALUES ('20180802','Sun','Sun','Clouds','Sun');
INSERT INTO mytable(Date,Toronto,Cairo,Zagreb,Ankara) VALUES ('20180803','Rain','Sun','Clouds','Rain');
INSERT INTO mytable(Date,Toronto,Cairo,Zagreb,Ankara) VALUES ('20180804','Clouds','Sun','Clouds','Clouds');
INSERT INTO mytable(Date,Toronto,Cairo,Zagreb,Ankara) VALUES ('20180805','Rain','Clouds','Rain','Rain');
INSERT INTO mytable(Date,Toronto,Cairo,Zagreb,Ankara) VALUES ('20180806','Rain','Sun','Sun','Sun');

The query above produced this result:

+----+---------------------+---------+---------+-------+
|    |        date         |  city   | weather | prior |
+----+---------------------+---------+---------+-------+
|  1 | 01.08.2018 00:00:00 | Ankara  | Sun     |     0 |
|  2 | 02.08.2018 00:00:00 | Ankara  | Sun     |     1 |
|  3 | 03.08.2018 00:00:00 | Ankara  | Rain    |     0 |
|  4 | 04.08.2018 00:00:00 | Ankara  | Clouds  |     0 |
|  5 | 05.08.2018 00:00:00 | Ankara  | Rain    |     1 |
|  6 | 06.08.2018 00:00:00 | Ankara  | Sun     |     2 |
|  7 | 01.08.2018 00:00:00 | Cairo   | Sun     |     0 |
|  8 | 02.08.2018 00:00:00 | Cairo   | Sun     |     1 |
|  9 | 03.08.2018 00:00:00 | Cairo   | Sun     |     2 |
| 10 | 04.08.2018 00:00:00 | Cairo   | Sun     |     3 |
| 11 | 05.08.2018 00:00:00 | Cairo   | Clouds  |     0 |
| 12 | 06.08.2018 00:00:00 | Cairo   | Sun     |     4 |
| 13 | 01.08.2018 00:00:00 | Toronto | Rain    |     0 |
| 14 | 02.08.2018 00:00:00 | Toronto | Sun     |     0 |
| 15 | 03.08.2018 00:00:00 | Toronto | Rain    |     1 |
| 16 | 04.08.2018 00:00:00 | Toronto | Clouds  |     0 |
| 17 | 05.08.2018 00:00:00 | Toronto | Rain    |     2 |
| 18 | 06.08.2018 00:00:00 | Toronto | Rain    |     3 |
| 19 | 01.08.2018 00:00:00 | Zagreb  | Clouds  |     0 |
| 20 | 02.08.2018 00:00:00 | Zagreb  | Clouds  |     1 |
| 21 | 03.08.2018 00:00:00 | Zagreb  | Clouds  |     2 |
| 22 | 04.08.2018 00:00:00 | Zagreb  | Clouds  |     3 |
| 23 | 05.08.2018 00:00:00 | Zagreb  | Rain    |     0 |
| 24 | 06.08.2018 00:00:00 | Zagreb  | Sun     |     0 |
+----+---------------------+---------+---------+-------+

For count of days since the first occurrence (within the past 5 days)

select 
        date, city, weather, datediff(day,ca.prior,cte.date) as prior
from cte
cross apply (
    select min(prev.date) as prior
    from cte as prev 
    where prev.city = cte.city
    and prev.date between dateadd(day,-6,cte.date) and dateadd(day,-1,cte.date)
    and prev.weather = cte.weather
    ) ca

Upvotes: 2

Ilyes
Ilyes

Reputation: 14928

I think you want something like

CREATE TABLE T
(
  [Date] DATE,
  Toronto VARCHAR(45),
  Cairo VARCHAR(45),
  Zagreb VARCHAR(45),
  Ankara VARCHAR(45)
);

INSERT INTO T VALUES
('2018-08-01', 'Rain', 'Sun', 'Clouds', 'Sun'),
('2018-08-02', 'Sun', 'Sun', 'Clouds', 'Sun'),
('2018-08-03', 'Rain', 'Sun', 'Clouds', 'Rain'),
('2018-08-04', 'Clouds', 'Sun', 'Clouds', 'Clouds'),
('2018-08-05', 'Rain', 'Clouds', 'Rain', 'Rain'),
('2018-08-06', 'Rain', 'Sun', 'Sun', 'Sun');

SELECT 
  (SELECT MAX(Occ) FROM (SELECT COUNT(Toronto) Occ FROM T WHERE Toronto = (select top 1 toronto from t order by date desc) GROUP BY Toronto) T) Toronto,
  (SELECT MAX(Occ) FROM (SELECT COUNT(Cairo) Occ FROM T WHERE Cairo = (select top 1 Cairo from t order by date desc) GROUP BY Cairo) T) Cairo,
  (SELECT MAX(Occ) FROM (SELECT COUNT(Zagreb) Occ FROM T WHERE Zagreb = (select top 1 Zagreb from t order by date desc)GROUP BY Zagreb) T) Zagreb,
  (SELECT MAX(Occ) FROM (SELECT COUNT(Ankara) Occ FROM T WHERE Ankara = (select top 1 Ankara from t order by date desc)GROUP BY Ankara) T) Ankara

Returns

+----+---------+-------+--------+--------+
|    | Toronto | Cairo | Zagreb | Ankara |
+----+---------+-------+--------+--------+
|  1 |       4 |     5 |      1 |      3 |
+----+---------+-------+--------+--------+

Demo

Upvotes: 1

Related Questions