Reputation: 312
I'm trying to pull this data from the same table using a join. Basically I need to create a timesheet report to see if an employee has submitted normal billable hours and vacation hours on the same date. Billable hours are type 1,2,5 and 6 and vacation hours are type 4. Here is what I have so far.
SELECT
a.attendant_id,
a.date,
a.start_time,
a.end_time,
a.hours AS HoursBilled,
a.type AS TypeBilled,
t.start_time,
t.end_time,
t.hours AS HoursVacation
FROM
timesheet_lines AS a
INNER JOIN (
SELECT
b.date,
b.attendant_id,
b.hours,
b.start_time,
b.end_time
FROM
timesheet_lines AS b
WHERE
b.date >= '2014-10-01'
AND b.date <= '2014-12-31'
AND b.type = 4
) AS t ON a.attendant_id = t.attendant_id
WHERE
a.date = t.date
AND a.type IN (1, 2, 5, 6)
I am getting the data that I need, the only problem is I am getting duplicates like so:
atten_id date start_time end_time HB TB s_t1 e_t1 HV
4584 2014-10-03 10:00:00 12:00:00 2 1 05:30:00 06:30:00 1
4584 2014-10-03 10:00:00 12:00:00 2 1 18:00:00 21:00:00 3
6139 2014-10-04 14:00:00 17:00:00 3 2 09:00:00 13:30:00 4.5
6842 2014-10-06 00:00:00 08:00:00 8 1 17:00:00 20:00:00 3
I abbreviated the field headers for the room. The first two rows are duplicated up to the 6th column. The last 3 columns contain good data. The reverse is also possible where the last 3 columns contain duplicates and the first 6 are good data.
Is there any way to just fill the duplicated data with "NULL"? I was going to try to filter out the duplicates on the PHP side but there has to be a better solution. Any help would be greatly appreciated.
Upvotes: 1
Views: 1163
Reputation: 21
I've always found the process of joining tables with themselves to be fraught with unexpected and inconvenient complexity. Here's different approach you might consider:
SELECT CONCAT(attendant_id,"|",date) AS query_key,
GROUP_CONCAT(DISTINCT IF(`type` in (1,2,5,6),`type`,NULL) AS n_hours,
GROUP_CONCAT(DISTINCT IF(`type` in (1,2,5,6),start_time,NULL) AS n_start,
GROUP_CONCAT(DISTINCT IF(`type` in (1,2,5,6),end_time,NULL) AS n_end,
GROUP_CONCAT(DISTINCT IF(`type` in (4),`type`,NULL) AS v_hours,
GROUP_CONCAT(DISTINCT IF(`type` in (4),start_time,NULL) AS v_start,
GROUP_CONCAT(DISTINCT IF(`type` in (4),end_time,NULL) AS v_end
FROM timesheet_lines
GROUP BY query_key
HAVING (v_hours IS NOT NULL) AND (n_hours IS NOT NULL)
;
The idea here is to concoct a field (query key) that combines all of the uniqueness criteria needed by the report. Once that is available, the report can be obtained without the need for a self-join.
I've had pretty good luck with these sorts of queries. In general, they are faster than self-joins because the table only needs to be scanned once.
Upvotes: 0
Reputation: 186
It looks like the query you've written is answering this question: What are the time sheet details for every day where there are both regular hours and vacation hours logged for the same person?
If you do want to answer the question, "On which dates has an employee submitted vacation hours and normal hours on the same day?", try something like this:
SELECT
a.attendant_id,
a.[date]
FROM timesheet_lines a
WHERE a.[type] IN (1, 2, 5, 6)
AND EXISTS (
SELECT 1
FROM timesheet_lines b
WHERE b.[type] = 4 AND
b.attendant_id = a.attendant_id AND
b.[date] = a.[date])
GROUP BY attendant_id, [date]
Or, if you need to know how many hours of each type were submitted by an employee on the same day, try this:
SELECT
a.attendant_id,
a.[date],
SUM(a.[hours]) AS TotalHoursBilled,
MAX(VacationHoursBilled) VacationHoursBilled --MAX because we're already joining on the fields we're grouping on, so there can be at most one result
FROM timesheet_lines a
INNER JOIN (
SELECT attendant_id, [date], SUM([hours]) AS VacationHoursBilled
FROM timesheet_lines
WHERE [type] = 4
GROUP BY attendant_id, [date]) b ON b.attendant_id = a.attendant_id AND b.[date] = a.[date]
WHERE a.[type] IN (1, 2, 5, 6)
GROUP BY a.attendant_id, a.[date]
Based on your update, it sounds like you need the vacation hour details if normal hours exist. In that case, you can use this:
SELECT attendant_id, [date], start_time, end_time, [hours]
FROM timesheet_lines a
WHERE a.[type] = 4
AND EXISTS (
SELECT 1
FROM timesheet_lines b
WHERE b.[type] IN (1, 2, 5, 6) AND
b.attendant_id = a.attendant_id AND
b.[date] = a.[date])
Upvotes: 2