Reputation: 3829
I have a relation between task and operations. I want to select the manual tasks without operation OR with a action != 'cancelled' for the last operation.
Example
tasks table :
-------------
|id | state |
| 3 |auto |
| 5 |manual |
| 6 |manual |
| 2 |manual |
| 8 |manual |
operations table :
----------------------------------
|id | action | task_id | time |
| 1 | processed | 8 | 8:00 |
| 2 | cancelled | 8 | 9:00 |
| 3 | processed | 6 | 8:00 |
| 4 | processed | 2 | 8:00 |
| 5 | cancelled | 2 | 7:00 |
result : tasks 5,6,2
Here is what I managed to do :
SELECT tasks.*
WHERE tasks.id NOT IN(
SELECT operations.task_id)
OR tasks.id IN
SELECT operations.task_id
WHERE action!=cancelled
.....
.....
ORDER BY time DESC
I am not sure how to end this query. Also it will perform 2 query on operations for each line in tasks.
Upvotes: 0
Views: 35
Reputation: 180058
If you're willing to rely on 'cancelled' being the last operation performed on a task if ever it is performed -- that is, if cancellation puts a task in a terminal state, then don't need to focus on cancellation being the last operation, only on whether there is a cancellation at all.
In that case, you can use a fairly simple outer join to find your result:
select t.*
from
tasks t
left outer join operations o
on t.id = o.task_id
and o.action = 'cancelled'
where t.state = 'manual' and o.id is null;
Note in particular the o.id is null
in the filter condition -- supposing that 'id' is the primary key for table operations
, it can be null in the joined result only where no row of operations
satisfies the join condition for a given row of tasks
. That is, only for tasks that have no associated 'cancelled' operation (including those with no operations at all).
On the other hand, if you need to include tasks that have a 'cancelled' operation, but for which the last operation is something different, then you might use a window function to help you identify the last operation for each task:
select t.*
from
tasks t
left outer join (
select
operations.*,
max(operations.time) over (partition by task_id) as last_time
from operations
) o
on t.id = o.task_id
and o.action = 'cancelled'
and o.time = o.last_time
where t.state = 'manual' and o.id is null;
It is essential here that the o.time = o.last_time
appear in the join condition, not a filter condition of the inline view (where it would be illegal) or the filter condition of the outer query (where it would have the wrong effect).
Upvotes: 1
Reputation: 3520
Let's say this is your schema:
CREATE TABLE tasks (
id int unique,
state text
);
CREATE TABLE operations (
id serial,
action text,
task_id int REFERENCES tasks (id),
time text
);
INSERT INTO tasks (id, state) VALUES (3, 'auto');
INSERT INTO tasks (id, state) VALUES (5, 'manual');
INSERT INTO tasks (id, state) VALUES (6, 'manual');
INSERT INTO tasks (id, state) VALUES (2, 'manual');
INSERT INTO tasks (id, state) VALUES (8, 'manual');
INSERT INTO operations (action, task_id, time) VALUES ('processed', 8, '8:00');
INSERT INTO operations (action, task_id, time) VALUES ('cancelled', 8, '9:00');
INSERT INTO operations (action, task_id, time) VALUES ('processed', 6, '8:00');
INSERT INTO operations (action, task_id, time) VALUES ('processed', 2, '8:00');
INSERT INTO operations (action, task_id, time) VALUES ('cancelled', 2, '7:00');
This would be your query:
SELECT *
FROM tasks
WHERE state = 'manual' AND
(
NOT EXISTS (SELECT 1 FROM operations WHERE task_id = tasks.id LIMIT 1)
OR
(SELECT action FROM operations WHERE task_id = tasks.id ORDER BY id DESC LIMIT 1) != 'cancelled'
)
;
Output:
id | state
----+--------
5 | manual
6 | manual
(2 rows)
Upvotes: 1
Reputation: 1269463
Hmmm. I'm thinking distinct on
to get the information about the operations:
select distinct on (task_id)
from operations o
order by task_id, time desc;
Then a left join
to get the tasks:
select t.*
from tasks t left join
(select distinct on (task_id)
from operations o
order by task_id, time desc
) o
on o.task_id = t.task_id
where o.task_id is null or o.action <> 'cancelled';
Upvotes: 1