Wes Doyle
Wes Doyle

Reputation: 2287

SQL Join, Selecting Earliest instance and Preserving Nulls

I have two tables containing the following data:

(File)
    +----+------+...
    | Id | File |
    +----+------+...
    |  1 | Fizz |
    |  2 | Buzz |
    +----+------+...

(Work)    
    +----+------+-----------+-----------+...
    | Id | File |   Step    |   Date    |
    +----+------+-----------+-----------+...
    |  1 | Fizz | Created   | 1/1/2015  |
    |  2 | Fizz | StartWork | 1/2/2015  |
    |  3 | Fizz | WorkDone  | 1/3/2015  |
    |  4 | Buzz | StartWork | 2/16/2015 |
    |  5 | Buzz | WorkDone  | 2/17/2015 |
    |  6 | Buzz | WorkDone  | 2/18/2015 |
    |  7 | Buzz | Closed    | 2/19/2015 |
    +----+------+-----------+-----------+...

I would like to generate the following result:

(Query Result)
    +------+-----------+-----------+-----------+-----------+
    | File |  Created  | StartWork | WorkDone  |  Closed   |
    +------+-----------+-----------+-----------+-----------+
    | Fizz | 1/1/2015  | 1/2/2015  | 1/3/2015  | NULL      |
    | Buzz | NULL      | 2/16/2015 | 2/17/2015 | 2/19/2015 |
    +------+-----------+-----------+-----------+-----------+

The query should return a single row for any File, with the earliest instance of a given Step for that File from the Work table. For example, 'WorkDone' occurs on two dates for the Buzz file, but I would like the Joined table to select only the WorkDone on 2/17/2015, which is the earliest time WorkDone occurred.

Also, for Steps which do not exist for a given File, I would like to show NULL.

Here is the query that I have been working on, but I'm afraid that I'm on the wrong track. I am not capturing NULL entries, and I believe I am using MIN incorrectly.

SELECT DISTINCT
f.File, 
MIN(workCreated.Date) AS 'Created', 
MIN(workStart.Date) AS 'StartWork', 
MIN(workDone.Date) AS 'WorkDone',
MIN(workClosed.Date) AS 'Closed'
FROM File f
LEFT JOIN Work workCreated ON f.File = workCreated.File
LEFT JOIN Work workStart ON f.File = workStart.File
LEFT JOIN Work workDone ON f.File = workDone.File
LEFT JOIN Work workClosed ON f.File = workClosed.File
WHERE
workCreated.Step = 'Created' AND
workStart.Step = 'StartWork' AND
workDone.Step = 'WorkDone' AND
workClosed.Step = 'Closed'
GROUP BY f.File

Upvotes: 0

Views: 46

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1270713

One method is conditional aggregation:

select w.file,
       min(case when step = 'Created' then date end) as Created,
       min(case when step = 'StartWork' then date end) as StartWork,
       min(case when step = 'WorkDone' then date end) as WorkDone,
       min(case when step = 'Closed' then date end) as Closed
from Work w
group by w.file;

The structure of your data suggests that file should probably not be in the Work table. The id should be. Then you can join to the other table.

Upvotes: 0

Emacs User
Emacs User

Reputation: 1475

You'll have to ignore the IDs for joining, instead use File and Step. Then you can aggregate on the minimum Date. Case logic can be used, but that becomes a bit more complex. Strictly speaking you don't need to use the File table.

select f.File,
Created=min(c.Date),
StartWork=min(s.Date),
WorkDone=min(d.Date),
Closed=min(l.Date)
from File as f
left join Work as c on f.File=c.File and c.Step='Created'
left join Work as s on f.File=s.File and s.Step='StartWork'
left join Work as d on f.File=d.File and d.Step='WorkDone'
left join Work as l on f.File=l.File and l.Step='Closed'
group by f.File

Upvotes: 1

Related Questions