user866364
user866364

Reputation:

Redshift: replace FULL OUTER for a CROSS JOIN

I would like to perform a full outer join using multiple OR values but i've read that PostgreSQL can only do a full outer join in a situation where the join conditions are distinct on each side of the = sign.

In my scenario, I have 2 tables: ticket and production. One register on Ticket can have a few values for Production.code. Example:

TICKET|custom_field_1|custom_field_2|custom_field_3
    1| 10            |9             | 
    2|               |8             | 

PRODUCTION|CODE
         1| 10
         5| 8  
        12| 9               

In the following example, Ticket ID 1 is related with Production Code 9 and 10. And Ticket ID 2 is related with Production Code 8.

I'm trying to write a query to return column Status from table Production:

SELECT 
  production.status  
FROM ticket 
FULL OUTER JOIN production ON ticket.custom_field_1 = production.code
            OR ticket.custom_field_2 = production.code
            OR ticket.custom_field_3 = production.code
GROUP BY 1
ORDER BY 1 
LIMIT 1000

When I try to run this query, I got an error: Invalid operation: FULL JOIN is only supported with merge-joinable join conditions;

So I've started to replace it for a CROSS JOIN. The query is almost working but I'm facing a difference number of rows:

SELECT count(production.id) FROM ticket
CROSS JOIN production
WHERE date(production.ts_real) >= '2019-03-01' AND
      ((ticket.custom_field_1 = sisweb_producao.proposta) OR
       (ticket.custom_field_2 = sisweb_producao.proposta) OR
       (ticket.custom_field_3 = sisweb_producao.proposta));

This query above should return 202 rows but only gives 181 because of my conditions. How can i make the cross join works like a FULL OUTER?

I'm using a tool called Looker, that's why I'm building this query on this way.

Upvotes: 1

Views: 5429

Answers (1)

Nathan Griffiths
Nathan Griffiths

Reputation: 12756

It's not quite clear what the schema of your tables is as some of your example SQL contains columns not in the example schema, but it looks like you could use an alternative approach of pivoting the ticket columns and joining them to the production table using an inner join to achieve the same thing:

SELECT 
    t1.ticket
    , production.id
    , production.status
FROM 
(
    SELECT 
        ticket
        , custom_field_1 AS code
    FROM 
        ticket
    WHERE 
        custom_field_1 IS NOT NULL

    UNION

    SELECT 
        ticket
        , custom_field_2 AS code
    FROM 
        ticket
    WHERE 
        custom_field_2 IS NOT NULL

    UNION

    SELECT 
        ticket
        , custom_field_3 AS code
    FROM 
        ticket
    WHERE 
        custom_field_3 IS NOT NULL
) t1
INNER JOIN 
    production ON t1.code = production.code

Based on the example data you provided, it looks like a ticket can be related to more than one production code, and hence more than one "status", so whichever way you do this be aware you will potentially have multiple result rows per ticket.

Upvotes: 3

Related Questions