thegunner
thegunner

Reputation: 7153

sql - count column depending on whether parents or children

I'm looking to get the sum of a column in certain conditions i.e. depending on whether a column is a parent with subtasks or just a parent with no subtasks. If a task is a parent with subtasks I only want the sum of the "complete" column of the subtasks. If a task is only a parent I would only want the sum of the complete column of the parent. - but only if the tasks(parent or children) are all related to the same user.

E.g. In the following example table:

UserID | Parent_TaskID | TaskID  | Complete
------ | ------------- | ------  | --------
435    | 149329        | 161280  |   1
435    | 149330        | 210717  |   2
435    | 149330        | 228100  |   3
435    | 156991        | 149330  |   1
169    | 458764        | 546540  |   2
169    | 456842        | 546541  |   2
169    | 456842        | 458764  |   0

TaskID 149330 is a parent with children 210717 & 228100 so the count for that column for 149330 is 5, i.e. ignoring the complete column for the parent. 161280 is a parent so only would return 0 for that. 546540 is a subtask of 458764, so 458764 would have a sum of 2.

So I think what I the result of this should look like:

TaskID  |  Sum_complete
------- |  ------------
161280  |  1
149330  |  5
546541  |  2
458764  |  2

Any ideas how this could be done?

I've created a table at SqlFiddle http://sqlfiddle.com/#!2/8295f

Thanks,

I can get the parents by using the following:

select t.taskID, t.Parent_taskID, t.userID, t.complete 
from task t
where t.Parent_taskID not in (
    select tp.taskID
    from task tp
    where tp.userID = t.userID 
)

Upvotes: 2

Views: 242

Answers (2)

HABO
HABO

Reputation: 15816

Based on your description, I think you are asking for the following (using the sample data from the question):

EDIT: Replaced query to eliminate children from the resultset.

declare @Task as Table
  ( UserId varchar(6), Parent_TaskId varchar(13), TaskId varchar(6), Complete integer );

INSERT INTO @Task ( UserId, Parent_TaskId, TaskId, Complete ) VALUES
    ('435', '149329', '161280', 1 ),
    ('435', '149330', '210717', 2 ),
    ('435', '149330', '228100', 3 ),
    ('435', '156991', '149330', 1 ),
    ('169', '458764', '546540', 2 ),
    ('169', '456842', '546541', 2 ),
    ('169', '456842', '458764', 0 );

; with QualifiedTasks as (
  select UserId, TaskId, Parent_TaskId, Complete,
    case when exists ( select 42 from @Task where Parent_TaskId = O.TaskId ) then 1 else 0 end as Parent,
    case when exists ( select 42 from @Task where O.Parent_TaskId = TaskId ) then 1 else 0 end as Child, -- Unused, but here for completeness.
    case when not exists ( select 42 from @Task where Parent_TaskId = O.TaskId or O.Parent_TaskId = TaskId ) then 1 else 0 end as Loner  
    from @Task as O )
  select L.TaskID, Sum( Coalesce( R.Complete, L.Complete ) ) as Sum_Complete
    from QualifiedTasks as L left outer join
      QualifiedTasks as R on R.Parent_TaskId = L.TaskId and R.UserId = L.UserId
    where L.Parent = 1 or L.Loner = 1
    group by L.TaskId;

A couple of obvious optimizations spring to mind. The Parent, Child and Loner columns are redundant, any of the three can be eliminated. Parent can be determined in the summary query by Max( R.Complete ) is not NULL. A clever trick for Child or Loner escapes me at the moment.

Upvotes: 3

Lamak
Lamak

Reputation: 70638

This should work (though the data you posted on your question is different than the one on the fiddle):

SELECT ISNULL(B.taskId,A.TaskId) TaskId, SUM(A.Complete) Complete
FROM Task A
LEFT JOIN Task B
ON A.Parent_TaskID = B.taskId
GROUP BY ISNULL(B.taskId,A.TaskId)

Upvotes: 0

Related Questions