Saturnix
Saturnix

Reputation: 10564

Sum of column when grouped by another column, but with different indexes

I have a table like this:

date                total_car_crashes   city_id
01.01               1                   Washington
01.02               1                   Washington
01.03               2                   Washington
01.04               2                   Washington
01.05               2                   Washington
01.06               3                   Washington
01.07               4                   Washington
01.08               4                   Washington
01.01               1                   Detroit
01.02               2                   Detroit
01.03               2                   Detroit
01.04               2                   Detroit
01.05               2                   Detroit
01.06               3                   Detroit
01.07               4                   Detroit

total_car_crashes is a cumulative counter of the number of car crashes for a given city, up to a given date.

I need a time serie with the cumulative counter of car crashes for the entire nation.

What I'm doing right now is aggregating the table so as to have a unique time serie with the total number of events happened.

SELECT
  SUM(total_car_crashes) AS total_car_crashes,
  "date"
FROM Table
GROUP BY "date"

This works fine, as long as every time serie has the same start/finish date. However, if one series goes to 01.08, the event counter for all the others is (as expected but not wanted) zero, rather than the max it reached.

01.01               2
01.02               3
01.03               4
01.04               4
01.05               4
01.06               6
01.07               8
01.08               4 // -> should be 8

I'd like the events_counter to keep going, rather counting as zero where not available.

Is there any way I can ask sql to use the max value of total_car_crashes for a time serie where date is not available?

I thought about using a cumulative sum, but total_car_crashes is already cumulative!

Upvotes: 0

Views: 36

Answers (1)

forpas
forpas

Reputation: 164064

Your requirement is complicated and so is the solution.
You need distinct dates and city_ids to be joined to the table and the maximum values of total_car_crashes:

select 
  d.date,
  sum(coalesce(t.total_car_crashes, m.max_car_crashes)) total_car_crashes
from (
  select * from (select distinct date from tablename)
  cross join (select distinct city_id from tablename)
) d
inner join (
  select date, max(total_car_crashes) max_car_crashes 
  from tablename
  group by date
) m on m.date = d.date
left join tablename t on t.date = d.date and t.city_id = d.city_id
group by d.date

See the demo.
Results:

| date  | total_car_crashes |
| ----- | ----------------- |
| 01.01 | 2                 |
| 01.02 | 3                 |
| 01.03 | 4                 |
| 01.04 | 4                 |
| 01.05 | 4                 |
| 01.06 | 6                 |
| 01.07 | 8                 |
| 01.08 | 8                 |

Upvotes: 2

Related Questions