michael
michael

Reputation: 3945

SQL INNER JOIN: ORDER BY LIMIT duplicates from one table

I have 2 tables without identical primary key between them (id exist in both, primary only for table A). I want to use the primary key of the first table A for the ON clause. Therefore I'll have duplicates from the second table B. I want to GROUP BY the duplicates based on some field B.cnt and always take the first one - DESC LIMIT 1.

This is what I tried (DBMS is PostgreSQL):

SELECT 
    scheme1.A.some_attr, 
    B.some_attr
FROM 
    (SELECT * FROM scheme2.B ORDER BY scheme2.B.cnt DESC LIMIT 1) AS B
INNER JOIN
    scheme1.A
ON
    scheme1.A.id = B.id
;

The query returns single record. While the desired behavior is to return single record just for each set of records from B having same id (based on the criteria mentioned). So in total the query of course will return multiple records...

How can I achieve the desired result?

Thanks,

Upvotes: 0

Views: 689

Answers (2)

Evaldas Buinauskas
Evaldas Buinauskas

Reputation: 14077

Your issue should be this line:

(SELECT * FROM scheme2.B ORDER BY scheme2.B.cnt DESC LIMIT 1) scheme2.B

It's being treated as following:

(SELECT * FROM scheme2.B ORDER BY scheme2.B.cnt DESC LIMIT 1) AS scheme2.B

Where scheme.B alias is obviously incorrect, change it following and it should work

SELECT 
    scheme1.A.some_attr, 
    scheme2.B.some_attr
FROM 
    (SELECT * FROM scheme2.B ORDER BY scheme2.B.cnt DESC LIMIT 1) AS B
INNER JOIN
    scheme1.A
ON
    scheme1.A.id = B.id
;

EDIT:

SELECT 
    scheme1.A.some_attr, 
    scheme2.B.some_attr
FROM 
    scheme1.A
LEFT JOIN LATERAL
    (SELECT * FROM scheme2.B WHERE scheme2.B.id = scheme2.A.id ORDER BY scheme2.B.cnt DESC LIMIT 1) AS B ON TRUE
;

If it's a single attribute, you could do the following:

SELECT 
    scheme1.A.some_attr, 
    (
        SELECT 
            scheme2.B.some_attr
        FROM
            scheme2.B
        WHERE
            scheme2.B.id = scheme2.A.id
        ORDER BY scheme2.B.cnt DESC LIMIT 1
    )
FROM 
    scheme1.A
;

Upvotes: 1

dovka
dovka

Reputation: 1061

Use rank() window analytical function, see Postgres Window Functions

SELECT * FROM (
                SELECT          
                scheme1.A.some_attr, 
                scheme2.B.some_attr,                     
                rank() OVER (PARTITION BY B.ID ORDER BY scheme2.B.cnt, scheme2.B.another_attr DESC) as rnk
                FROM 
                                scheme1.A
                INNER JOIN
                                scheme2.B
                ON
                                scheme2.B.id = scheme2.A.id
) A WHERE rnk = 1;

Upvotes: 1

Related Questions