user2079024
user2079024

Reputation: 161

How to use MAX from single column from single table while querying multiple columns from multiple tables

So I got a request for some data from someone in my company. This data request requires SQL query of multiple columns from multiple tables BUT can only be the most recent incident from a column from one of those tables. Here is the kicker...each table has a single column that it shares with another table. So to get this "report" I have to do it piece by piece by piece.

Here is how my initial query looked before I realized I only need the most recent update to one of the tables:

SELECT a.description  AS "Description", 
       a.pricing      AS "Price", 
       b.id           AS "ID", 
       c.descriptionb AS "DescriptionB", 
       c.date         AS "date", 
       d.descriptionc AS "DescriptionC" 
FROM   database.table1 a, 
       database.table2 b, 
       database.table3 c, 
       database.table4 d 
WHERE  a.description = b.descriptive_info 
       AND b.id = c.comp_id 
       AND c.descriptionb = d.long_description
       AND d.id_for_a = a.id
       AND a.company IN ( '000', '001', '002', '003', '004' ) 
       AND b.expdate >= Now()

I realized that the "c.date" above needs to only display the most recent date for each unique ID/DescriptionC.

Here is an example result of the initial query:

Description|Price   |ID       |DescriptionB   |date     |DescriptionC
---------------------------------------------------------------------
Computer   |300     |554      |5% Off         |3/2/2010 |Includes CPU
Computer   |300     |554      |5% Off         |3/2/2010 |Includes DOG
Computer   |300     |554      |5% Off         |3/2/2010 |Includes CAT 
Computer   |300     |554      |9% Off         |4/3/2011 |Includes CPU
Computer   |300     |554      |9% Off         |4/3/2011 |Includes DOG
Computer   |300     |554      |9% Off         |4/3/2011 |Includes CAT 
Computer   |300     |554      |7% Off         |9/1/2019 |Includes CPU
Computer   |300     |554      |7% Off         |9/1/2019 |Includes DOG
Computer   |300     |554      |7% Off         |9/1/2019 |Includes CAT 
Printer    |75      |801      |3% Off         |6/3/2012 |Includes DOS
Printer    |75      |801      |3% Off         |6/3/2012 |Includes PIG
Printer    |75      |801      |3% Off         |6/3/2012 |Includes RAT
Printer    |75      |801      |9% Off         |8/3/2013 |Includes DOS
Printer    |75      |801      |9% Off         |8/3/2013 |Includes PIG
Printer    |75      |801      |9% Off         |8/3/2013 |Includes RAT
Printer    |75      |801      |1% Off         |1/3/2019 |Includes DOS
Printer    |75      |801      |1% Off         |1/3/2019 |Includes PIG
Printer    |75      |801      |1% Off         |1/3/2019 |Includes RAT

Here is the result of the query by Laurenz below:

Description|Price   |ID       |DescriptionB   |date     |DescriptionC
---------------------------------------------------------------------
Computer   |300     |554      |7% Off         |9/1/2019 |Includes CAT

...close but not quite there yet.

Desired Result:

Description|Price   |ID       |DescriptionB   |date     |DescriptionC
---------------------------------------------------------------------
Computer   |300     |554      |7% Off         |9/1/2019 |Includes CPU
Computer   |300     |554      |7% Off         |9/1/2019 |Includes DOG
Computer   |300     |554      |7% Off         |9/1/2019 |Includes CAT 
Printer    |75      |801      |1% Off         |1/3/2019 |Includes DOS
Printer    |75      |801      |1% Off         |1/3/2019 |Includes PIG
Printer    |75      |801      |1% Off         |1/3/2019 |Includes RAT

As you can see there are multiple dates for the "same product", duplicate products with varying descriptions, etc. I basically only want the rows with the most recent "date" for each unique ID/DescriptionC. Hopefully this is a little easier to understand than my original post.

BTW these are simplified examples as I don't want to get in trouble by my company but the query and concepts are the same. If you can imagine multiple products with numerous instances of each product multiple times you can imagine how big the data set could get. I only care about the most recent instance of each unique ID/DescriptionC.

Upvotes: 1

Views: 155

Answers (3)

user2079024
user2079024

Reputation: 161

Thank you so much everyone. The answer from Hal McGee was basically correct! I just had to tweak a few things. Here is an example of the full query I ended up using that functioned as I was needing it to:

SELECT *
FROM (
    SELECT 
        a.description  AS "Description",
        a.pricing      AS "Price", 
        b.string       AS "String", 
        c.description  AS "Description", 
        c.date         AS "date", 
        d.descriptionb AS "DescriptionB",
        ROW_NUMBER() OVER (PARTITION BY d.descriptionc, b.id ORDER BY c.date DESC ) AS rn
    FROM
       database.table1 a 
       INNER JOIN database.table2 b ON a.id = b.table1_id 
       INNER JOIN database.table3 c ON b.element = c.table2_element AND b.expdate >= Now()
       INNER JOIN database.table4 d ON c.value = d.table3_value 
    WHERE 
        a.company IN ( '000', '001', '002', '003', '004' ) 
) x WHERE rn = 1;

Upvotes: 1

Hal McGee
Hal McGee

Reputation: 21

The following sql can be used to solve the question:

SELECT *
FROM (
    SELECT 
        a.description  AS "Description",

        a.pricing      AS "Price", 
        b.string       AS "String", 
        c.description  AS "Description", 
        c.date         AS "date", 
        d.descriptionb AS "DescriptionB",
        ROW_NUMBER() OVER (ORDER BY c.date DESC PARTITION BY B.ID ) AS rn
    FROM
       database.table1 a 
       INNER JOIN database.table2 b ON a.id = b.table1_id 
       INNER JOIN database.table3 c ON b.element = c.table2_element AND b.expdate >= Now()
       INNER JOIN database.table4 d ON c.value = d.table3_value 
    WHERE 
        a.company IN ( '000', '001', '002', '003', '004' ) 
) x WHERE rn = 1;

Upvotes: 2

GMB
GMB

Reputation: 222572

I understand that, out of the results from the current query, you just want to select the one that has the maximum value on c.date.

One solution would be to turn the existing query to a subquery, and use ROW_NUMBER() to rank the records by descending c.date. Then, the outer query can just filter on the highest ranked record.

Query:

SELECT *
FROM (
    SELECT 
        a.description  AS "Description", 
        a.pricing      AS "Price", 
        b.string       AS "String", 
        c.description  AS "Description", 
        c.date         AS "date", 
        d.descriptionb AS "DescriptionB",
        ROW_NUMBER() OVER (ORDER BY c.date DESC) AS rn
    FROM
       database.table1 a 
       INNER JOIN database.table2 b ON a.id = b.table1_id 
       INNER JOIN database.table3 c ON b.element = c.table2_element AND b.expdate >= Now()
       INNER JOIN database.table4 d ON c.value = d.table3_value 
    WHERE 
        a.company IN ( '000', '001', '002', '003', '004' ) 
) x WHERE rn = 1;

PS:

  • always prefer explicit joins instead of old-school, implicit joins; I changed the initial query accordingly

  • if you need the maximum date by partition (which is not obvious to tell without seeing sample data), then you simply need to add a PARTITION BY clause to the ROW_NUMBER() function.

Upvotes: 4

Related Questions