featuredpeow
featuredpeow

Reputation: 2201

Select rows from a table satisfying criteria for all rows in a child table which have at least one record in another table

We have table "PROCESS" with process_id primary key. Processes have "items" which are stored in another table "ITEM" with (process_id, item_id) as primary key. Each item have "events" which are stored in yet another table "EVENT" with (process_id, item_id, event_id) as primary key. Event have type (stored in "events"."event_type" column) Let's suppose there are events with type "A".

I want to select processes which have at least one event of type "A" for all their items (so, if one item doesn't have such event, I don't need such process in result set).

I ended up with following query:

SELECT needed_processes.process_id FROM (
SELECT items.process_id, items.number_of_items, events.number_of_events    FROM 
(SELECT process.process_id, count(*) number_of_items FROM process
  JOIN item ON process.process_id = item.process_id 
  GROUP BY process.process_id
) items JOIN 
(SELECT needed_events.process_id, count(*) number_of_events FROM 
  (SELECT process.process_id, item.item_id FROM process JOIN item
    ON process.process_id = item.process_id JOIN events ON item.process_id = event.process_id 
    AND item.item_id = event.item_id
    WHERE event.event_type = 'A' 
    group by process.process_id, item.item_id
  ) needed_events group by needed_events.process_id
) events ON items.process_id = events.process_id
where items.number_of_items = events.number_of_events) needed_processes

It counting numbers of items for a process and checks that number of desired events for process equals to it's number of items.

This query is hard to read, hard to understand and doesn't look like efficient one.

Is there a simpler queries (in terms of reading, or in terms of performance) for this task?

I will be fine with oracle-specific queries, database agnostic queries are also welcome.

Examples

Process

|process_id|
|1         |
|2         |
|3         |
|4         |

Item (item always belong only to one process)

|process_id|item_id|
|1         |11     |
|1         |12     |
|1         |13     |
|2         |14     |
|2         |15     |
|3         |16     |

Event (event always belong only to one item)

|process_id|item_id|event_id|event_type|
|1         |11     |21      |A         |
|1         |11     |22      |A         |
|1         |11     |23      |B         |
|1         |13     |24      |A         |
|2         |14     |25      |A         |
|2         |14     |26      |A         |
|2         |15     |27      |A         |
|2         |15     |28      |B         |

Result

|process_id|
|2         |

process_id=1 should be filtered out because it doesn't have event of type A for item 12. It has two events of type A for item 11, but they should be treated as "item 11 has event A". process_id=2 should be returned in result set, because it has events of type A for all his items. It has two events of type A for item 14 and this should not affect result. process_id=3 should not be returned because it doesn't have any events (=> doesn't have event of type A for every of it's items) process_id=4 should not be returned because it doesn't have any items (corner case).

Upvotes: 1

Views: 99

Answers (1)

dnoeth
dnoeth

Reputation: 60462

This returns all processes where there's an event 'A' for every item:

select process_id
from events
group by process_id
having count(distinct item_id)                                     -- all items
     = count(distinct case when event_type = 'A' then item_id end) -- only items with event 'A'

Upvotes: 1

Related Questions