Reputation: 1211
Background
I'm pretty new to SQL and to Postgres. I'm trying to get some basic descriptions of some data in a large table about car and motorcycle insurance claims -- call the table claims
. The data have ~15 columns, but only 2 are relevant here:
id | claim_date |
---------------+--------------|
001 | 2011-01-30 |
001 | 2012-04-12 |
001 | 2014-06-27 |
002 | 2017-09-03 |
002 | 2018-12-25 |
The Problem
I'd like to know the average interval of time that ID
s appear in this dataset. Were I to do this by hand, I'd subtract the earliest from the latest claim_date
for each ID
, sum them, and divide by the number of distinct ID
s (2, in this case). This'd give me a result of 861
days here.
The main issue is that I'm not really sure how to write this out in query form in Postgres.
What I've Tried
In similar posts, other users have asked about subtracting the date in one date
column from another to get a date interval, but that's not exactly what I'm after.
I've tried a query like this:
select max(claims.claim_date) - min(claims.claim_date) date_diff
from claims
group by ID
But only gets me part way, just a list of max
-min
intervals for each ID
. I've also tried using AVG
like so:
select AVG(max(claims.claim_date) - min(claims.claim_date))
from claims
group by ID
But here I get the error ERROR: aggregate function calls cannot be nested
.
Any ideas?
Upvotes: 0
Views: 662
Reputation: 37472
Just use two levels of aggregation.
SELECT avg(d) AS ad
FROM (SELECT max(claim_date)
- min(claim_date) AS d
FROM claims
GROUP BY id) AS x;
Upvotes: 2