Reputation: 112
I have a table in an old version of MySQL 5.x
like this:
+---------+------------+------------+
| Task_ID | Start_Date | End_Date |
+---------+------------+------------+
| 1 | 2015-10-15 | 2015-10-16 |
| 2 | 2015-10-17 | 2015-10-18 |
| 3 | 2015-10-19 | 2015-10-20 |
| 4 | 2015-10-21 | 2015-10-22 |
| 5 | 2015-11-01 | 2015-11-02 |
| 6 | 2015-11-17 | 2015-11-18 |
| 7 | 2015-10-11 | 2015-10-12 |
| 8 | 2015-10-12 | 2015-10-13 |
| 9 | 2015-11-11 | 2015-11-12 |
| 10 | 2015-11-12 | 2015-11-13 |
| 11 | 2015-10-01 | 2015-10-02 |
| 12 | 2015-10-02 | 2015-10-03 |
| 13 | 2015-10-03 | 2015-10-04 |
| 14 | 2015-10-04 | 2015-10-05 |
| 15 | 2015-11-04 | 2015-11-05 |
| 16 | 2015-11-05 | 2015-11-06 |
| 17 | 2015-11-06 | 2015-11-07 |
| 18 | 2015-11-07 | 2015-11-08 |
| 19 | 2015-10-25 | 2015-10-26 |
| 20 | 2015-10-26 | 2015-10-27 |
| 21 | 2015-10-27 | 2015-10-28 |
| 22 | 2015-10-28 | 2015-10-29 |
| 23 | 2015-10-29 | 2015-10-30 |
| 24 | 2015-10-30 | 2015-10-31 |
+---------+------------+------------+
If the End_Date
of the tasks are consecutive,
then they are part of the same project.
I am interested in finding the total number of different projects completed.
If there is more than one project that have the same number of completion days,
then order by the Start_Date
of the project.
For this few sample records the expected output would be:
2015-10-15 2015-10-16
2015-10-17 2015-10-18
2015-10-19 2015-10-20
2015-10-21 2015-10-22
2015-11-01 2015-11-02
2015-11-17 2015-11-18
2015-10-11 2015-10-13
2015-11-11 2015-11-13
2015-10-01 2015-10-05
2015-11-04 2015-11-08
2015-10-25 2015-10-31
I am a bit jammed with this. I would really appreciate any help. Thanks.
Upvotes: 1
Views: 383
Reputation: 2838
It's a little tricky problem, but the query below works fine.
It builds two tables, one with Start_Date
and other with End_Date
that NOT IN End_Date
and Start_Date
respectively from Projects
table,
and query these tables fetching Start_Date WHERE Start_Date < End_Date
grouping by Start_Date
using aggregate function MIN
with End_Date
to get a complete Project.
DATEDIFF(MIN(End_Date), Start_Date)
to calculate project_duration and able to order by project_duration
.
SELECT Start_Date, MIN(End_Date) AS End_Date, DATEDIFF(MIN(End_Date), Start_Date) AS project_duration
FROM
(SELECT Start_Date FROM Projects WHERE Start_Date NOT IN (SELECT End_Date FROM Projects)) a,
(SELECT End_Date FROM Projects WHERE End_Date NOT IN (SELECT Start_Date FROM Projects)) b
WHERE Start_Date < End_Date
GROUP BY Start_Date
ORDER BY project_duration ASC, Start_Date ASC;
expected output
+------------+------------+---------------+
| Start_Date | End_Date | project_duration |
+------------+------------+---------------+
| 2015-10-15 | 2015-10-16 | 1 |
| 2015-10-17 | 2015-10-18 | 1 |
| 2015-10-19 | 2015-10-20 | 1 |
| 2015-10-21 | 2015-10-22 | 1 |
| 2015-11-01 | 2015-11-02 | 1 |
| 2015-11-17 | 2015-11-18 | 1 |
| 2015-10-11 | 2015-10-13 | 2 |
| 2015-11-11 | 2015-11-13 | 2 |
| 2015-10-01 | 2015-10-05 | 4 |
| 2015-11-04 | 2015-11-08 | 4 |
| 2015-10-25 | 2015-10-31 | 6 |
+------------+------------+---------------+
Upvotes: 0
Reputation: 1270061
This answers -- and answers correctly -- the original version of this question.
Hmmmm . . . I think you can use variables. The simplest way is to generate a sequential number and then subtract this value to get a constant for adjacent rows from the date:
select min(start_date), max(end_date)
from (select t.*, (@rn := @rn + 1) as rn
from (select t.* from tasks t order by end_date) t cross join
(select @rn := 0) params
) t
group by (end_date - interval rn day);
Here is a db<>fiddle.
Upvotes: 2
Reputation: 3592
Following query should work:
select tmp.projectid, date_sub(max(tmp.ed2), interval max(tmp.projectdays) day) start_date,
max(tmp.ed2) end_date,
max(tmp.projectdays) No_Of_ProjectDays
from
(
select t1.task_id tid1, t1.start_date sd1, t1.end_date ed1,
t2.task_id tid2, t2.start_date sd2, t2.end_date ed2,
case when datediff(t2.start_date, ifnull(t1.start_date,'1000-01-01')) != 1
then (@pid := @pid + 1)
else (@pid := @pid)
end as ProjectId,
case when datediff(t2.start_date, ifnull(t1.start_date,'1000-01-01')) != 1
then (@pdays := 1)
else (@pdays := @pdays + 1)
end as ProjectDays
from tasks t1 right join tasks t2
on t2.task_id = t1.task_id + 1
cross join (select @pid :=1, @pdays := 1) vars
) tmp
group by tmp.projectid
order by max(tmp.projectdays), start_date
Please find the Demo here.
EDIT : I have made changes in the query and link according to new data sample. Please have a look.
Upvotes: 2