Join after Group by performance

Question

Is it faster to join the tables first and then group by several columns (including titles), or to group the rows in a sub-query and then join the other tables? Is the second method slow because the grouped result has no indexes? Should I order the rows manually in the second method to get a merge join instead of a nested loop? What is the proper way to do this?

This is the first method. It became quite messy because contragent_title and product_title have to appear in GROUP BY in strict mode, and I only work in strict GROUP BY mode.

SELECT
    s.contragent_id,
    s.contragent_title,
    s.product_id AS sort_id,
    s.product_title AS sort_title,
    COALESCE(SUM(s.amount), 0) AS amount,
    COALESCE(SUM(s.price), 0) AS price,
    COALESCE(SUM(s.discount), 0) AS discount,
    COUNT(DISTINCT s.product_id) AS sorts_count,
    COUNT(DISTINCT s.contragent_id) AS contragents_count,
    dd.date,
    ~grouping(dd.date, s.contragent_id, s.product_id) :: bit(3) AS mask
FROM date_dimension dd
LEFT JOIN (
    SELECT 
        s.id, 
        s.created_at,
        s.contragent_id, 
        ca.title AS contragent_title,
        p.id AS product_id, 
        p.title AS product_title,
        sp.amount, 
        sp.price, 
        sp.discount
    FROM sales s
    LEFT JOIN sold_products sp 
        ON s.id = sp.sale_id
    LEFT JOIN products p 
        ON sp.product_id = p.id
    LEFT JOIN contragents ca 
        ON s.contragent_id = ca.id
    WHERE s.created_at BETWEEN :caf AND :cat
        AND s.plant_id = :plant_id
        AND (s.is_cache = :is_cache OR :is_cache IS NULL)
        AND (sp.product_id = :sort_id OR :sort_id IS NULL)
) s ON dd.date = date(s.created_at)                
WHERE (dd.date BETWEEN :caf AND :cat)
GROUP BY GROUPING SETS (
    (dd.date, s.contragent_id, s.contragent_title, s.product_id, s.product_title),
    (dd.date, s.contragent_id, s.contragent_title),
    (dd.date)
)


Answers (1)

Join, then aggregate:
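
The answer's original query is not preserved here, so the following is only a minimal sketch of the join-then-aggregate shape, not the answerer's code. It reuses the table and parameter names from the question, drops date_dimension and the GROUPING SETS for brevity, uses inner joins, and assumes that contragents.id and products.id are primary keys. Under that assumption PostgreSQL treats the titles as functionally dependent on the grouped ids, so they can be selected without being listed in GROUP BY.

```sql
-- Sketch only: join first, then aggregate.
-- Assumes contragents.id and products.id are PRIMARY KEYs, so the titles
-- are functionally dependent on the grouped ids.
SELECT
    s.created_at::date AS sale_date,
    ca.id              AS contragent_id,
    ca.title           AS contragent_title,
    p.id               AS sort_id,
    p.title            AS sort_title,
    COALESCE(SUM(sp.amount), 0)   AS amount,
    COALESCE(SUM(sp.price), 0)    AS price,
    COALESCE(SUM(sp.discount), 0) AS discount
FROM sales s
JOIN contragents ca   ON ca.id = s.contragent_id
JOIN sold_products sp ON sp.sale_id = s.id
JOIN products p       ON p.id = sp.product_id
WHERE s.created_at BETWEEN :caf AND :cat
  AND s.plant_id = :plant_id
GROUP BY s.created_at::date, ca.id, p.id;
```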

Aggregate then join:
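
The answer's original query is not preserved here either, so this is a hedged sketch of the aggregate-then-join shape built from the names in the question. The sub-query groups on ids only, and the lookup tables are joined afterwards just to fetch the titles. The alias agg and the column sale_date are introduced for illustration, and the join to date_dimension and the optional :is_cache and :sort_id filters are left out to keep it short.

```sql
-- Sketch only: aggregate on ids first, then join for the titles.
SELECT
    agg.sale_date,
    agg.contragent_id,
    ca.title       AS contragent_title,
    agg.product_id AS sort_id,
    p.title        AS sort_title,
    agg.amount,
    agg.price,
    agg.discount
FROM (
    SELECT
        s.created_at::date AS sale_date,
        s.contragent_id,
        sp.product_id,
        COALESCE(SUM(sp.amount), 0)   AS amount,
        COALESCE(SUM(sp.price), 0)    AS price,
        COALESCE(SUM(sp.discount), 0) AS discount
    FROM sales s
    LEFT JOIN sold_products sp ON sp.sale_id = s.id
    WHERE s.created_at BETWEEN :caf AND :cat
      AND s.plant_id = :plant_id
    GROUP BY GROUPING SETS (
        (s.created_at::date, s.contragent_id, sp.product_id),
        (s.created_at::date, s.contragent_id),
        (s.created_at::date)
    )
) agg
-- Rollup rows carry NULL ids, so the LEFT JOINs leave their titles NULL.
LEFT JOIN contragents ca ON ca.id = agg.contragent_id
LEFT JOIN products p     ON p.id = agg.product_id;
```

The aggregated result itself has no indexes, but the joins that follow look up rows in contragents and products by id, so indexes on those columns remain usable. Comparing the EXPLAIN ANALYZE output of both variants on real data is the reliable way to see which join strategy the planner picks.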
