d_a_c321
d_a_c321

Reputation: 533

Is there any way to get postgresql to report results from a join?

In other statistical softwares (STATA), when you perform a join between two separate tables there are options to reports the results of a join

For instance, if you join a table with another table on a column and the second table has non-unique values, it reports that.

Likewise, if you perform an inner join it reports the number of rows dropped from both tables and if you perform a left or right outer join it lets you know how many rows were unmatched.

Upvotes: 2

Views: 87

Answers (2)

wildplasser
wildplasser

Reputation: 44250

It will need a nasty outer join. Here is the CTE version:

-- Some data
CREATE TABLE bob
        ( ID INTEGER NOT NULL
        , zname varchar
        );
INSERT INTO bob(id, zname) VALUES
 (2, 'Alice') ,(3, 'Charly')
,(4,'David') ,(5, 'Edsger') ,(6, 'Fanny')
        ;
CREATE TABLE john
        ( ID INTEGER NOT NULL
        , zname varchar
        );
INSERT INTO john(id, zname) VALUES
 (4,'David') ,(5, 'Edsger') ,(6, 'Fanny')
,(7,'Gerard') ,(8, 'Hendrik') ,(9, 'Irene'), (10, 'Joop')
        ;

 --
 -- Encode presence in bob as 1, presence in John AS 2, both=3
 --
WITH flags AS (
        WITH b AS (
                SELECT 1::integer AS flag, id
                FROM bob
                )
        , j AS (
                SELECT 2::integer AS flag, id
                FROM john
                )
        SELECT COALESCE(b.flag, 0) + COALESCE(j.flag, 0) AS flag
        FROM b
        FULL OUTER JOIN j ON b.id = j.id
        )
SELECT flag, COUNT(*)
FROM flags
GROUP BY flag;

The result:

CREATE TABLE
INSERT 0 5
CREATE TABLE
INSERT 0 7
 flag | count 
------+-------
    1 |     2
    3 |     3
    2 |     4
(3 rows)

Upvotes: 1

Wolph
Wolph

Reputation: 80091

As far as I know there is no option to do that within Postgres, although you could get a guess by looking at the estimates.

Calculating the missing rows requires you to count all rows so databases generally try to avoid things like that.

The options I can think of:

  • writing multiple queries
  • doing a full outer join and filtering the results (maybe with a subquery... can't think of a good way which will always easily work)
  • use writable complex table expressions to log the intermediate results

Upvotes: 0

Related Questions