Greenhorn
Greenhorn

Reputation: 411

Rewriting subquery as JOIN?

UPDATE: Thanks to Sifu Bill's advice I have amended the SQL query. Now it returns the correct number of distinct assets (five).

Is it possible to rewrite the following subquery as a JOIN?

SELECT TOP 100 PERCENT Asset_ID, work_order_id, status_id,
downtime_hours, date_completed
FROM dbo.mtvw_wo_reason1
WHERE (Asset_ID IN
(SELECT TOP 5 Asset_ID from mtvw_wo_reason1
WHERE (Date_Completed BETWEEN '1-Oct-2009' AND '31-Oct-2009')
GROUP BY Asset_ID
ORDER BY SUM(Downtime_hours) DESC))

The output follows:

Asset_ID work_order_id status_id downtime_hours Date_Completed

SFM2019 2009-724    Completed   .5000   2009-09-01 17:55:09.000
SFM2019 2009-761    Completed   .5000   2009-09-15 09:03:39.000
SFM2019 2009-828    Completed   8.0000  2009-10-02 00:00:00.000
SFM2043 2009-683    Completed   .5000   2009-08-14 00:00:00.000
SFM2043 2009-734    Completed   1.0000  2009-09-05 00:54:33.000
SFM2043 2009-741    Completed   1.0000  2009-09-08 17:05:09.000
SFM2043 2009-756    Completed   .5000   2009-09-14 07:56:41.000
SFM2043 2009-792    Completed   1.0000  2009-09-22 00:00:00.000
SFM2043 2009-826    Completed   8.0000  2009-10-03 02:00:19.130
SFM2043 2009-983    Completed   1.0000  2009-10-30 00:00:00.000
SFM2045 2009-794    Completed   17.0000 2009-09-24 00:00:00.000
SFM2045 2009-808    Completed   1.0000  2009-09-26 16:01:25.850
SFM2045 2009-811    Completed   3.0000  2009-09-27 09:47:45.543
SFM2045 2009-816    Completed   24.0000 2009-09-30 15:14:35.000
SFM2045 2009-858    Completed   3.0000  2009-10-12 00:00:00.000
SFM2045 2009-861    Completed   .5000   2009-10-13 01:11:50.900
SFM2045 2009-975    Completed   1.0000  2009-10-28 00:00:00.000
SFM2045 2009-984    Completed   3.0000  2009-10-30 00:00:00.000
SFM2088 2009-777    Completed   .5000   2009-10-01 12:49:57.000
SFM2088 2009-853    Completed   .5000   2009-10-09 00:00:00.000
SFM2088 2009-855    Completed   10.0000 2009-10-09 00:00:00.000
SFM2200 2009-753    Completed   8.0000  2009-09-11 00:00:00.000
SFM2200 2009-830    Completed   8.0000  2009-10-09 00:00:00.000

In essence the query needs to:

Return the top 5 assets with the highest cumulative downtime for the month of October, and a breakdown of the associated workorders for these assets

The query has to be rewritten in such a way that does not require the the existence of Date_Completed criteria in the inner query (is this even possible?). Using Sifu Bill's answer yielded only 11 rows (there should be 23):

Output from Sifu Bill's query:

Asset_ID work_order_id status_id downtime_hours Date_Completed

SFM2045 2009-858    Completed   3.0000  2009-10-12 00:00:00.000
SFM2045 2009-861    Completed   .5000   2009-10-13 01:11:50.900
SFM2045 2009-975    Completed   1.0000  2009-10-28 00:00:00.000
SFM2045 2009-984    Completed   3.0000  2009-10-30 00:00:00.000
SFM2200 2009-830    Completed   8.0000  2009-10-09 00:00:00.000
SFM2043 2009-826    Completed   8.0000  2009-10-03 02:00:19.130
SFM2043 2009-983    Completed   1.0000  2009-10-30 00:00:00.000
SFM2088 2009-777    Completed   .5000   2009-10-01 12:49:57.000
SFM2088 2009-853    Completed   .5000   2009-10-09 00:00:00.000
SFM2088 2009-855    Completed   10.0000 2009-10-09 00:00:00.000
SFM2019 2009-828    Completed   8.0000  2009-10-02 00:00:00.000

Thank you for any insight.

Upvotes: 1

Views: 7289

Answers (2)

Bill Karwin
Bill Karwin

Reputation: 563021

It's hard to guess what you intend this query to do. I suggest that when you ask how to fix a query that isn't doing what you want, you include in your question a clear statement about what the purpose of the query is. Otherwise people who answer can't infer from a wrong query what would be the right one.

Also it's helpful if you include the definition of the table. The best way to show this is the Data Definition Language (i.e. CREATE TABLE) that you used to define the table.

I'm assuming this statement describes your desired query result:

"All rows during October for each Asset_ID's that have the most cumulative downtime during that month."

You can get the assets that have the most downtime during October like you were doing in the subquery:

SELECT TOP 5 Asset_ID 
FROM mtvw_wo_reason1
WHERE Date_Completed BETWEEN '2009-10-1' and '2009-10-31'
GROUP  BY Asset_ID
ORDER BY SUM(Downtime_hours) DESC

Notice that this query could be run standalone. This is a clue that if you use it as a subquery, it's non-correlated. That is, it contains no reference to values in an outer query.

You don't want to add Date_Completed to the GROUP BY clause as @astander showed. You want groups to be per asset alone, so you can get accurate SUM() per asset.

However, the select-list of this table doesn't include the other columns from the table, and the rows are reduced to a single row per Asset_ID by the GROUP BY. So it doesn't give all the information you need. Therefore you need to run it as a subquery.

SELECT m.Asset_ID, m.work_order_id, m.status_id, m.downtime_hours, m.Date_Completed
FROM dbo.mtvw_wo_reason1 m
JOIN (SELECT TOP 5 Asset_ID 
    FROM mtvw_wo_reason1
    WHERE Date_Completed BETWEEN '2009-10-1' and '2009-10-31'
    GROUP  BY Asset_ID
    ORDER BY SUM(Downtime_hours) DESC) sub
  ON (m.Asset_ID = sub.Asset_ID)
WHERE m.Date_Completed BETWEEN '2009-10-1' and '2009-10-31';

You do need to restrict by Date_Completed in both the subquery and the outer query.

PS: From what I've read, the only valid use of TOP 100 PERCENT is when you're defining a VIEW with an ORDER BY clause at the end.

Upvotes: 2

Adriaan Stander
Adriaan Stander

Reputation: 166626

You can try

SELECT  TOP 100 PERCENT 
        Asset_ID, 
        work_order_id, 
        status_id, 
        downtime_hours 
FROM    dbo.mtvw_wo_reason1 INNER JOIN
        (
            SELECT  TOP 5 
                    Asset_ID 
            from    mtvw_wo_reason1 
            GROUP   BY Asset_ID, Date_Completed 
            HAVING  Date_Completed BETWEEN '1-Oct-2009' and '31-Oct-2009' 
            ORDER BY    SUM(Downtime_hours) DESC
        ) sub ON dbo.mtvw_wo_reason1.Asset_ID = sub.Asset_ID

EDIT for comment:

SELECT  TOP 100 PERCENT         
        Asset_ID,         
        work_order_id,         
        status_id,         
        downtime_hours ,
        sub.Date_Completed --added
FROM    dbo.mtvw_wo_reason1 INNER JOIN        
        (                
            SELECT  TOP 5                                 
                    Asset_ID,
                    Date_Completed --added
            from    mtvw_wo_reason1                 
            GROUP   BY Asset_ID, Date_Completed                 
            HAVING  Date_Completed BETWEEN '1-Oct-2009' and '31-Oct-2009'                 
            ORDER BY SUM(Downtime_hours) DESC        
        ) sub ON dbo.mtvw_wo_reason1.Asset_ID = sub.Asset_ID

Upvotes: 3

Related Questions