TheBiscuit
TheBiscuit

Reputation: 103

Create multiple SUM columns with data from a date range, where one column combines data from a joined third table

I’m creating a query that is grouping by an 'entity' column and also is outputing 4 new columns.

Any guidance on how to make the below query possible? or guidance on a more efficient query(s)?
The actual query I'm building is in the second code block.

SELECT
    ent.entity,
    SUM(data.amount) curr_month /* when data.ds between 1st of current month AND current date)*/
    SUM(data.amount) prev_month /* when data.ds between 1st of previous month AND current date)*/
    (SUBTRACT prev_month sum from curr_month sum) movement,
    (
      SUM(prev_data.amount) /* when prev_data.ds between 1st day of Previous year AND current date)*/
      + SUM(data.amount) /* when data.ds between 1st day of Current year AND current date)*/
    ) previous_year
FROM entity_table ent
JOIN current_year_table data
    ON data.unit = ent.unit
JOIN previous_year_table prev_data
    ON prev_data.unit = ent.unit
GROUP BY
    entity;

My current implementation of the above query

This is without the 'previous_year_table' join and seems to work so far but feels hacky. Also I'm using prestodb syntax e.g. DATE_TRUNC, current_date etc

SELECT
    ent.entity,
    SUM(
        CASE
            WHEN data.ds BETWEEN CAST(DATE_TRUNC('month', current_date) AS VARCHAR)
                AND CAST(current_date AS VARCHAR) THEN data.amount_usd
            ELSE 0
        END
    ) curr_month,
    SUM(
        CASE
            WHEN data.ds BETWEEN CAST(
                (DATE_TRUNC('month', current_date) - INTERVAL '1' MONTH) AS VARCHAR
            ) AND CAST(current_date AS VARCHAR) THEN data.amount_usd
            ELSE 0
        END
    ) prev_month,
    (
        SUM(
            CASE
                WHEN data.ds BETWEEN CAST(DATE_TRUNC('month', current_date) AS VARCHAR)
                    AND CAST(current_date AS VARCHAR) THEN data.amount_usd
                ELSE 0
            END
        ) - SUM(
            CASE
                WHEN data.ds BETWEEN CAST(
                    (DATE_TRUNC('month', current_date) - INTERVAL '1' MONTH) AS VARCHAR
                ) AND CAST(current_date AS VARCHAR) THEN data.amount_usd
                ELSE 0
            END
        )
    ) movement
FROM entity_table ent
JOIN current_year_table data
    ON data.operating_unit = ent.operating_unit
GROUP BY
    entity;

Desired Query Output

entity current_month previous_month movement(curr - prev) previous_year
entity_1 20 40 -20 70
entity_2 10 50 -40 90

Query Input Tables

One to 'many' relationship between Entity_Table unit column and ####_Year_Table unit columns

Entity_Table

entity unit (join col)
entity_1 1
entity_2 2

Current_Year_Table

unit (join col) amount ds (datestamp string, curr year)
1 20 2022-05-21
1 20 2022-04-19
2 10 2022-05-20
2 40 2022-04-26

Previous_Year_Table

unit (join col) amount ds (datestamp string, prev year)
1 20 2021-08-29
2 30 2021-03-18
2 10 2021-01-21
1 10 2021-02-13

Upvotes: 0

Views: 66

Answers (1)

Guru Stron
Guru Stron

Reputation: 142233

You have quite strange formula for movement cause from this two :

curr_month /* when data.ds between 1st of current month AND current date)/
prev_month /
when data.ds between 1st of previous month AND current date)*/

movement will always be just some for the previous month (between 1st and last day of month) taken with negative sign.

Also based on data it seems that for previous_year you actually want sum of current and previous years.

Other than that your approach is fine, I would suggest just small tweaks using if's, subselect and using data functions to make it more readable:

--sample data
WITH entity_table  (entity, operating_unit) AS (
    VALUES  ('entity_1', 1),
    ('entity_2', 2)
),
current_year_table (operating_unit, amount_usd, ds) AS (
    VALUES  
(1, 20, '2022-05-21'),
(1, 20, '2022-04-19'),
(2, 10, '2022-05-20'),
(2, 40, '2022-04-26')
),
previous_year_table (operating_unit, amount_usd, ds) AS (
    VALUES  
(1, 20, '2021-08-29'),
(2, 30, '2021-03-18'),
(2, 10, '2021-01-21'),
(1, 10, '2021-02-13')
)

Query:

-- query
select entity,
    curr_month,
    prev_month,
    curr_month - prev_month as movement,
    total_current_year + previous_year as previous_year
from (
        select ent.entity,
            ent.operating_unit,
            sum(
                if(
                    DATE_TRUNC('month', date(data.ds)) = DATE_TRUNC('month', now()),
                    amount_usd,
                    0
                )
            ) curr_month,
            sum(
                if(
                    // use = instead of >= if you want actual diff between current and previous month
                    DATE_TRUNC('month', date(data.ds)) >= DATE_TRUNC('month', now()) - interval '1' month,
                    amount_usd,
                    0
                )
            ) prev_month,
            sum(amount_usd) total_current_year
        FROM entity_table ent
            JOIN current_year_table data ON data.operating_unit = ent.operating_unit
        GROUP BY entity,
            ent.operating_unit
    ) as current_year_data
    join (
        select operating_unit,
            sum(amount_usd) as previous_year
        from previous_year_table
        group by operating_unit
    ) previous_year_data on current_year_data.operating_unit = previous_year_data.operating_unit

Output:

entity curr_month prev_month movement previous_year
entity_1 20 40 -20 70
entity_2 10 50 -40 90

Upvotes: 1

Related Questions