Dave Colland
Dave Colland

Reputation: 37

How to group and collect data in PostgreSQL by date periods?

Here is a demo data:

create table Invoices (
    id INT,
    name VARCHAR,
    customer_id INT,
    total_amount FLOAT,
    state VARCHAR,
    invoice_date DATE
);

INSERT INTO Invoices
    (id, name, customer_id, total_amount, state, invoice_date)
VALUES
    (1,  'INV/2020/0001', 2, 100, 'posted', '2020-04-05'),
    (2,  'INV/2020/0002', 1, 100, 'draft',  '2020-04-05'),
    (3,  'INV/2020/0003', 2, 100, 'draft',  '2020-05-24'),
    (4,  'INV/2020/0004', 1, 100, 'posted', '2020-05-25'),
    (5,  'INV/2020/0005', 2, 100, 'posted', '2020-06-05'),
    (6,  'INV/2020/0006', 1, 100, 'posted', '2020-07-05'),
    (7,  'INV/2020/0007', 1, 100, 'draft',  '2020-08-24'),
    (8,  'INV/2020/0008', 1, 100, 'posted', '2020-08-25'),
    (9,  'INV/2020/0009', 1, 100, 'posted', '2020-09-05'),
    (10, 'INV/2020/0010', 1, 100, 'draft',  '2020-09-05'),
    (11, 'INV/2020/0011', 2, 100, 'draft',  '2020-10-24'),
    (12, 'INV/2020/0012', 1, 100, 'posted', '2020-10-25'),
    (13, 'INV/2020/0013', 2, 100, 'posted', '2020-11-05'),
    (14, 'INV/2020/0014', 1, 100, 'posted', '2020-11-05'),
    (15, 'INV/2020/0015', 2, 100, 'draft',  '2020-11-24'),
    (16, 'INV/2020/0016', 1, 100, 'posted', '2020-11-25')

I have a query that computes a sum of all posted invoices for customer with id = 1

SELECT sum(total_amount), customer_id 
    FROM Invoices 
        WHERE state = 'posted' AND customer_id = 1  
    GROUP BY customer_id

I need to group the data (sum(total_amount)) by 3 time periods - 2 or 3 months each (2 or 3 needs to be able to change by changing the number in the query. I want to pass it as a parameter to the query from python code). Also I need to get the average sums of the period. Can you help me please?

Expected output for period = 2 months is:

+--------------+--------------+--------------+--------+
| Period_1_sum | Period_2_sum | Period_3_sum |  Avg   |
+--------------+--------------+--------------+--------+
|          300 |          300 |          100 | 233.33 |
+--------------+--------------+--------------+--------+ 

Upvotes: 2

Views: 122

Answers (1)

user330315
user330315

Reputation:

You can use conditional aggregation for that:

SELECT customer_id, 
       sum(total_amount) as total_amount,
       sum(total_amount) filter (where invoice_date >= date '2020-04-01' and invoice_date < date '2020-07-01') as period_1_sum,
       sum(total_amount) filter (where invoice_date >= date '2020-07-01' and invoice_date < date '2020-10-01') as period_2_sum,
       sum(total_amount) filter (where invoice_date >= date '2020-10-01' and invoice_date < date '2021-01-01') as period_3_sum
FROM Invoices 
WHERE state = 'posted' 
GROUP BY customer_id

By changing the filter condition you can control which rows are aggregated for each period.

Online example

Upvotes: 1

Related Questions