leinho
leinho

Reputation: 59

The best SQL Code to get this Output

Simply put, I have these tables.

Planning: 
  Date | Machine | Product | PlanningParts

OKParts: 
  Date | Machine | Product | OKParts

Scrap: 
  Date | Machine | Product | ScrapParts

Trials: 
  Date | Machine | Product | TrialParts

Breakdowns: 
  Date | Machine | Product | Minutes

These tables can have two or more records for a same [Date | Machine | Product], they are not unique columns. What is the best SQL Code(SQLite or SQL Server) to get this output?

Output:
  Date | Machine | Product | Planning | OKParts | ScrapParts | TrialParts | Minutes

Edit: In the output, i need one line per [Date,Machine,Product] and the rest of the columns have to be the SUM (I don't speak english, Sorry)

Eidted: Example: (I would not put the table "Trials" to make it shorter)

Planning:
     Date    | Machine | Product | PlanningParts
     1/6/12  |  Blower | A001    | 100
     2/6/12  |  Blower | A002    | 100
     2/6/12  | Assembly| B001    | 50

OKParts:
     Date   | Machine | Product  | OKParts
     1/6/12 | Blower  | A001     | 50
     1/6/12 | Blower  | A001     | 20
     1/6/12 | Blower  | A002     | 100

Scrap:
     Date   | Machine | Product  | ScrapParts | Reason
     1/6/12 | Blower  | A001     | 5          | Low Weight
     1/6/12 | Blower  | A001     | 3          | High Weight
     2/6/12 | Assembly| B001     | 4          | Bad Cut

Breakdowns
     Date   | Machine | Product  | Minutes    | Reason
     1/6/12 | Blower  | A001     | 100        | Manteinance
     1/6/12 | Blower  | A001     | 20         | Manteinance
     2/6/12 | Assembly| B001     | 100        | Quality approval


   OUTPUT:
         Date   | Machine | Product  | Planning | OKParts | ScrapParts | Breakdowns
         1/6/12 | Blower  | A001     | 100      | 70      | 8          | 120
         1/6/12 | Blower  | A002     | 100      | 100     | 0          | 0
         2/6/12 | Assembly| B001     | 50       | 0       | 4          | 100

Upvotes: 0

Views: 161

Answers (2)

ErikE
ErikE

Reputation: 50241

I'd be interested to see if SUMming after the UNION performs as well as this query that SUMs first:

WITH Vals AS (
   SELECT Date, Machine, Product, 'PlanningParts' Which, Sum(PlanningParts) Value FROM Planning GROUP BY Date, Machine, Product
   UNION ALL SELECT Date, Machine, Product, 'OKParts', Sum(OKParts) FROM OKParts GROUP BY Date, Machine, Product
   UNION ALL SELECT Date, Machine, Product, 'ScrapParts', Sum(ScrapParts) FROM Scrap GROUP BY Date, Machine, Product
   UNION ALL SELECT Date, Machine, Product, 'TrialParts', Sum(TrialParts) FROM Trials GROUP BY Date, Machine, Product
   UNION ALL SELECT Date, Machine, Product, 'Minutes', Sum(Minutes) FROM Breakdowns GROUP BY Date, Machine, Product
)
SELECT *
FROM Vals
PIVOT (Max(Value) FOR Which IN (PlanningParts, OKParts, ScrapParts, TrialParts, Minutes)) P;

Yeah, it's painful, but that's what you get from the given database design. Refactoring these 5 tables into one table, with views replacing these tables having instead-of triggers could be viable.

Note: PIVOT is for SQL Server 2005 and up. But you don't even need PIVOT:

WITH Vals AS (
   SELECT Date, Machine, Product, 'PlanningParts' Which, Sum(PlanningParts) Value FROM Planning GROUP BY Date, Machine, Product
   UNION ALL SELECT Date, Machine, Product, 'OKParts', Sum(OKParts) FROM OKParts GROUP BY Date, Machine, Product
   UNION ALL SELECT Date, Machine, Product, 'ScrapParts', Sum(ScrapParts) FROM Scrap GROUP BY Date, Machine, Product
   UNION ALL SELECT Date, Machine, Product, 'TrialParts', Sum(TrialParts) FROM Trials GROUP BY Date, Machine, Product
   UNION ALL SELECT Date, Machine, Product, 'Minutes', Sum(Minutes) FROM Breakdowns GROUP BY Date, Machine, Product
)
SELECT
   Date, Machine, Product,
   Sum(CASE Which WHEN 'PlanningParts' THEN Value END) PlanningParts,
   Sum(CASE Which WHEN 'OKParts' THEN Value END) OKParts,
   Sum(CASE Which WHEN 'ScrapParts' THEN Value END) ScrapParts,
   Sum(CASE Which WHEN 'TrialParts' THEN Value END) TrialParts,
   Sum(CASE Which WHEN 'Minutes' THEN Value END) Minutes
FROM Vals;

Switching the text Which values to integers will probably yield a speed improvement.

In regards to my comment/question about a table with a distinct list of the Date, Machine, Product values that are in your tables, this query would provide such a list. This won't perform well, but it should give you the idea.

WITH DistinctKeys AS ( --wishing we had this as a real table
   SELECT Date, Machine, Product FROM Planning
   UNION SELECT Date, Machine, Product FROM OKParts
   UNION SELECT Date, Machine, Product FROM Scrap
   UNION SELECT Date, Machine, Product FROM Trials
   UNION SELECT Date, Machine, Product FROM Breakdown
) -- because then we could do this:
SELECT
   K.Date, K.Machine, K.Part,
   (SELECT Sum(PlanningParts) FROM Planning X ON EXISTS (SELECT K.* INTERSECT SELECT X.Date, X.Machine, X.Product) PlanningParts
   (SELECT Sum(OKParts) FROM OKParts X ON EXISTS (SELECT K.* INTERSECT SELECT X.Date, X.Machine, X.Product) OKParts
   (SELECT Sum(ScrapParts) FROM Scrap X ON EXISTS (SELECT K.* INTERSECT SELECT X.Date, X.Machine, X.Product) Scrap
   (SELECT Sum(TrialParts) FROM Trials X ON EXISTS (SELECT K.* INTERSECT SELECT X.Date, X.Machine, X.Product) TrialParts
   (SELECT Sum(Minutes) FROM Breakdown X ON EXISTS (SELECT K.* INTERSECT SELECT X.Date, X.Machine, X.Product) Minutes
FROM
   DistinctKeys K;

But frankly, the JOINs are not going to perform as well as the UNIONs previously given in my and @jaypeagi's answers.

One more thing: do NOT assume what the performance is going to be of a query. Even experts check execution plans and collect real IO and CPU statistics to determine this information. Your worries about NULLs reducing performance are likely completely unfounded.

If you have good indexes on your tables on [Date, Machine, Product] then you will probably get reasonably good performance out of the UNION methods.

Upvotes: 1

jaypeagi
jaypeagi

Reputation: 3141

Create a CTE unioning all the tables into one, then using the SUM aggrigate function, add up the columns for each Date, Machine, Product group. A bit like this (not tested):

WITH AllParts AS (
SELECT Date, Machine, Product, PlanningParts, NULL AS OKParts, NULL AS ScrapParts, NULL AS TrialParts, NULL AS Breakdowns
FROM Planning
UNION ALL
SELECT Date, Machine, Product,NULL AS PlanningParts, OKParts, NULL AS ScrapParts, NULL AS TrialParts, NULL AS Breakdowns
FROM OKParts
UNION ALL
SELECT Date, Machine, Product,NULL AS PlanningParts, NULL AS OKParts, ScrapParts, NULL AS TrialParts, NULL AS Breakdowns
FROM Scrap
UNION ALL
SELECT Date, Machine, Product,NULL AS PlanningParts, NULL AS OKParts, NULL AS ScrapParts, TrialParts, NULL AS Breakdowns
FROM Trials
UNION ALL
SELECT Date, Machine, Product,NULL AS PlanningParts, NULL AS OKParts, NULL AS ScrapParts, TrialParts, Breakdowns
FROM BreakDowns
)
SELECT
Date, Machine, Product, SUM(OKParts) AS OKParts, SUM(ScrapParts) AS ScrapParts, SUM(TrialParts) AS TrialParts, SUM(BreakDowns) AS Breakdowns
FROM AllParts
GROUP BY Date, Machine, Product

Upvotes: 3

Related Questions