kkris77
kkris77

Reputation: 117

SQL WHERE MAX(date) within GROUP BY

I've created a SQL Fiddle (http://sqlfiddle.com/#!9/e0536/1) with similar data I've got at work (there are actually more columns in the table). Table contains employment details. An employee can have more than one record in the table (couple of fixed-term contracts) as well as different employee_ID (change from 'tixxxxx' into 'pixxxxx'). The PESEL number is the unique personal identification number. ID for past contract can be higher than for actual one as the table is populated with data every day as an extract based on HR data.

What I need to get is:

It's been some time since I used SQL every day so I'd appreciate any help here. I was thinkig of some nested queries with group by clause, but I never understood well correlated subqueries.

Expected result:

ID          Employee_ID PESEL       StartDate  ExpirationDate
----------- ----------- ----------- ---------- --------------
1           pi39764     1111        2014-01-01 2016-06-01
2           pi12986     1234        2015-12-01 2099-12-31
5           pi12345     4321        2015-02-01 2099-12-31

where the startdate is the very first startdate.

Upvotes: 1

Views: 1435

Answers (4)

Adam Silenko
Adam Silenko

Reputation: 3108

probably you looking for query like this:

SELECT e.*, CASE WHEN actual = StartDate THEN 1 ELSE 0 END AS actual_e, first_startdate
FROM Employees AS e
INNER JOIN(SELECT PESEL, MIN(startdate) AS first_startdate , MAX(startdate) AS actual 
  FROM Employees AS e
  GROUP BY PESEL) AS g
  ON g.PESEL = e.PESEL

EDIT:
to get actual Employee_ID on every row use sub query:

, CASE WHEN actual = StartDate THEN null 
ELSE (SELECT max(a.Employee_ID) FROM Employees AS a WHERE a.PESEL = e.PESEL and a.StartDate = actual)
END AS actual_Employee_ID

EDIT:
in Fidde you write MySQL query, for sql server (tag) it is much simplest:

SELECT e.*
, LEAD(Employee_ID) OVER (PARTITION BY PESEL ORDER BY startdate) actual_Employee_ID
, MIN(startdate) OVER (PARTITION BY PESEL) first_startdate
FROM Employees AS e

EDIT (result with last ti):
for all data:

SELECT e.*
, first_startdate
, last_t_startdate
, last_startdate
, (SELECT max(employee_ID) FROM dbo.Employees t WHERE startdate = last_t_startdate AND PESEL = e.PESEL) AS last_t_id
, (SELECT max(employee_ID) FROM dbo.Employees t WHERE startdate = last_startdate AND PESEL = e.PESEL) AS last_id
FROM dbo.Employees AS e
OUTER APPLY (
  SELECT Min(startdate) AS first_startdate
  , Max(Case When employee_ID LIKE 'ti%' Then startdate End) last_t_startdate
  , Max(startdate) AS last_startdate    
  FROM dbo.Employees
  WHERE PESEL = e.PESEL
  --GROUP BY PESEL
  ) AS g 

output:

ID  Employee_ID PESEL   StartDate   ExpirationDate  first_startdate last_t_startdate    last_startdate  last_t_id   last_id
1   pi39764 1111    2015-01-01  2016-06-01  2014-01-01  2014-01-01  2015-01-01  ti00001 pi39764
2   pi12986 1234    2015-12-01  2099-12-31  2015-12-01  NULL    2015-12-01  NULL    pi12986
3   ti00001 1111    2014-01-01  2014-12-31  2014-01-01  2014-01-01  2015-01-01  ti00001 pi39764
4   pi12345 4321    2015-02-01  2015-06-30  2015-02-01  NULL    2016-01-01  NULL    pi12345
5   pi12345 4321    2016-01-01  2099-12-31  2015-02-01  NULL    2016-01-01  NULL    pi12345
6   pi12345 4321    2015-07-01  2015-12-31  2015-02-01  NULL    2016-01-01  NULL    pi12345

for grouped data:

SELECT pesel
, first_startdate
, last_t_startdate
, last_startdate
, (SELECT max(employee_ID) FROM dbo.Employees t WHERE startdate = last_t_startdate AND PESEL = g.PESEL) last_t_id
, (SELECT max(employee_ID) FROM dbo.Employees t WHERE startdate = last_startdate AND PESEL = g.PESEL) last_id
FROM (
  SELECT PESEL
  , Min(startdate) AS first_startdate
  , Max(Case When employee_ID LIKE 'ti%' Then startdate End) AS last_t_startdate 
  , Max(startdate) AS last_startdate
  FROM dbo.Employees
  GROUP BY PESEL) AS g

output:

pesel   first_startdate last_t_startdate    last_startdate  last_t_id   last_id
1111    2014-01-01  2014-01-01  2015-01-01  ti00001 pi39764
1234    2015-12-01  NULL    2015-12-01  NULL    pi12986
4321    2015-02-01  NULL    2016-01-01  NULL    pi12345

Upvotes: 1

Guido
Guido

Reputation: 926

SELECT e1.employee_id, e.pesel, e.maxdate
FROM (
      SELECT pesel, MAX(expirationdate) as maxdate
      FROM employees
      GROUP BY pesel
) e
INNER JOIN employees e1
ON e.pesel = e1.pesel AND e.maxdate = e1.expirationdate

Output:

| Employee_ID | pesel |                    maxdate |
|-------------|-------|----------------------------|
|     pi39764 |  1111 |     June, 01 2016 00:00:00 |
|     pi12986 |  1234 | December, 31 2099 00:00:00 |
|     pi12345 |  4321 | December, 31 2099 00:00:00 |

To find the first date and the last date for each PESEL, use:

SELECT e1.employee_id, e.pesel, e.startdate, e.enddate
FROM (
      SELECT pesel, 
           MIN(startdate) as startdate, 
           MAX(expirationdate) as enddate
      FROM employees
      GROUP BY pesel
) e
INNER JOIN employees e1
ON e.pesel = e1.pesel AND e.enddate = e1.expirationdate

Upvotes: 2

Adam Silenko
Adam Silenko

Reputation: 3108

so try this:

SELECT ID, Employee_ID, PESEL, first_startdate AS StartDate, ExpirationDate
FROM (
    SELECT e.*
    , LEAD (Employee_ID) OVER (PARTITION BY PESEL ORDER BY startdate) actual_Employee_ID
    , MIN(startdate) OVER (PARTITION BY PESEL) first_startdate
    FROM Employees AS e) AS x
WHERE actual_Employee_ID is null

then you get:

| ID | Employee_ID | PESEL |  StartDate | ExpirationDate |
|----|-------------|-------|------------|----------------|
|  1 |     pi39764 |  1111 | 2014-01-01 |     2016-06-01 |
|  2 |     pi12986 |  1234 | 2015-12-01 |     2099-12-31 |
|  5 |     pi12345 |  4321 | 2015-02-01 |     2099-12-31 |

http://sqlfiddle.com/#!3/e0536/7

Upvotes: 1

Mihai
Mihai

Reputation: 26784

Use the SUBTRING_INDEX with GROUP_CONCAT trick to get all the columns

select SUBSTRING_INDEX(GROUP_CONCAT(Employee_ID ORDER BY StartDate DESC),',',1),
PESEL,MIN(StartDate),MAX(ExpirationDate)
from employees
GROUP BY PESEL
ORDER BY 2

Upvotes: 1

Related Questions