Reputation: 165
I have these 2 tables:
users
| id | name |
--------------
| 1 | user1 |
| 2 | user2 |
| 3 | user3 |
| 4 | user4 |
| 5 | user4 |
user_tasks
| id | user_id | status | date_created |
---------------------------------------------------
| 1 | 1 | started | 2020-08-04 00:00:00 |
| 2 | 1 | pending | 2020-08-04 00:00:00 |
| 3 | 1 | pending | 2020-08-04 00:00:00 |
| 4 | 1 | pending | 2020-08-04 00:00:00 |
| 5 | 1 | validated | 2020-08-04 00:00:00 |
| 6 | 1 | ended | 2020-08-04 00:00:00 |
| 7 | 2 | started | 2020-08-04 00:00:00 |
| 8 | 2 | pending | 2020-08-04 00:00:00 |
| 9 | 2 | pending | 2020-08-04 00:00:00 |
| 10 | 2 | pending | 2020-08-04 00:00:00 |
| 11 | 2 | ended | 2020-08-04 00:00:00 |
| 12 | 3 | started | 2020-08-04 00:00:00 |
| 13 | 3 | pending | 2020-08-04 00:00:00 |
| 14 | 3 | pending | 2020-08-04 00:00:00 |
| 15 | 3 | pending | 2020-08-04 00:00:00 |
| 16 | 3 | ended | 2020-08-04 00:00:00 |
I would like to get all users that have a task with the statuses pending
and ended
, and NOT the validated
status.
Simply put; I would like a result with which users
have NOT validated
a user_task
.
In the case above, it should return users.id 2, 3
since they have not validated a task.
How would I go about doing this? So far I came up with this:
SELECT *
FROM
(
SELECT id
FROM users
JOIN user_tasks ON users.id = user_tasks.user_id
WHERE user_tasks.status = 'pending'
GROUP BY id
) as query_1
JOIN
(
SELECT id
FROM users
JOIN user_tasks ON users.id = user_tasks.user_id
WHERE user_tasks.status != 'validated'
GROUP BY id
) as query_2
ON query_1.id = query_2.id
But this query also returns the users
who have validated a task.
Also, I am not sure how to properly pose this question. If anyone has any feedback for a better title, please feel free to edit.
Upvotes: 0
Views: 27
Reputation: 1269753
specifically the users that do not have a task with validated status.
If I understand correctly, you can use not exists
:
select u.*
from users u
where not exists (select 1
from user_tasks ut
where ut.user_id = u.id and ut.status = 'validated'
);
EDIT:
Based on your comment, I would go for aggregation:
select ut.user_id
from user_tasks ut
group by ut.user_id
having sum( ut.status = 'validated' ) = 0 and
sum( ut.status = 'pending' ) > 0 and
sum( ut.status = 'ended' ) > 0 ;
Upvotes: 1
Reputation: 1630
This is probably how I would do it if that was all I wanted out of the data and I wanted to do it quick:
with pending_users as (
select
user_id
from
user_tasks
where
status = 'pending'
)
select
distinct
u.name
from
user_tasks ut
join users u on ut.user_id = u.id
where
ut.status = 'validated'
and
u.id in (select user_id from pending_users)
Maybe not the most efficient or elegant, but it is the first approach to this query that pops out of my head.
More likely I'd want to do deeper analysis, and maybe build a "funnel" data viz which showed each step and where users were dropping off. That would take a little bit more time than a quick off the top of my head approach. I'd probably do an array_agg
and then sort the arrays and follow up with a count of the length of the arrays, and maybe a last_value()
window function to see where things stand for that user.
Upvotes: 0