Reputation: 40169
mysql> describe jobs;
+--------------+---------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+---------+------+-----+---------+----------------+
| job_id | int(11) | NO | PRI | NULL | auto_increment |
| candidate_id | int(11) | NO | MUL | NULL | |
| company_id | int(11) | NO | MUL | NULL | |
| start_date | date | NO | MUL | NULL | |
| end_date | date | NO | MUL | NULL | |
+--------------+---------+------+-----+---------+----------------+
5 rows in set (0.01 sec)
Each candidate can have multiple jobs. I want to find the latest job for each candidate (based on start date, because end date can be 0000-00-00), and check (in PHP) if the end_Date is not 0000-00-00.
(if your last end date is not 0000-00-00, then you are currently unemployed, which is what I am looking for; I don't know how to do this in my query, so will do it in PHP).
The query SELECT candidate_id, end_Date FROM jobs ORDER BY candidate_id , start_date DESC
gets me halfway there.
+--------------+------------+
| candidate_id | end_Date |
+--------------+------------+
| 1 | 2019-08-31 |
| 1 | 2019-01-31 |
| 1 | 2019-05-31 |
| 2 | 0000-00-00 |
| 2 | 2018-02-28 |
| 2 | 2017-05-31 |
| 2 | 2016-09-30 |
| 3 | 0000-00-00 |
| 3 | 2019-05-31 |
| 4 | 2019-04-30 |
| 4 | 2019-09-30 |
(How) can I get only the first entry (row with the most recent start_date
) for each candidate Id? And can I get only those where the end date is not 0000-00-00?
(Oops, it looks like my ordering by end_date is not working)
Upvotes: 2
Views: 68
Reputation: 1270181
You can do this with aggregation:
select candidate_id,
(case when sum(end_date = '0000-00-00') > 0
then '0000-00-00'
else max(end_date)
end) as enddate
from jobs j
group by candidate_id;
Or another method:
select j.*
from jobs j
where j.end_date = '0000-00-00' or
(not exists (select 1
from jobs j2
where j2.candidate_id = j.candidate_id and
(j2.end_date = '0000-00-00' or
j2.end_date > j.end_date
)
)
);
Or even:
select j.*
from jobs j
where j.job_id = (select j2.job_id
from jobs j2
where j2.candidate_id = j.candidate_id
order by (j2.end_date = '0000-00-00') desc,
j2.end_date desc
);
Upvotes: 0
Reputation: 1606
Option without sub-query:
SELECT
j.*
FROM
jobs AS j
LEFT JOIN jobs AS j2 ON (
j2.candidate_id = j.candidate_id
AND j2.start_date > j.start.date
)
WHERE
j2.candidate_id IS NULL
You'd like to have composite index (candidate_id, start_date) to optimize the query.
Upvotes: 1
Reputation: 133380
You could use a join on max start_date group by candidate
select *
from jobs j
inner join (
select candidate_id , max(start_date) max_start_date
from jobs
group by candidate_id
) t on t.candidate_id = j.candidate_id
and t.max_start_date = j.start_date
Upvotes: 1
Reputation: 222542
You can filter with a correlated subquery:
select j.*
from jobs j
where j.start_date = (
select max(start_date)
from jobs j1
where j1.candidate_id = j.candidate_id and j1.end_date <> '0000-00-00'
)
The subquery returns the greatest start_date
whose end_date
is not null
for the current candidate.
Another typical method to solve this top 1 per group problem is to use an anti
-left join:
select j.*
from jobs j
left join jobs j1
on j1.candidate_id = j.candidate_id
and j1.start_date > j.start_date
and j1.end_date is not null
where
j.end_date is not null
and j1.job_id is null
This phrases as: give me the records with a non-null
end_date
for which no other record exists with the same candidate_id
, a greater start_date
and a non-null
end_date
.
Upvotes: 2