asciipip
asciipip

Reputation: 519

With SQL, how can I find non-matches in a single table many-to-many relation?

I have a database that, among other things, records the results of reactions between ingredients. It's currently structured with the following three tables:

| Material       |
|----------------|
| id : Integer   |
| name : Varchar |

| Reaction        |
|-----------------|
| id : Integer    |
| <other details> |

| Ingredient            |
|-----------------------|
| material_id : Integer |
| reaction_id : Integer |
| quantity : Real       |

This maps the many-to-many relationship between materials and reactions.

I would like to run a query that returns every pair of materials that do not form a reaction. (i.e. every pair (x, y) such that there is no reaction that uses exactly x and y and no other materials.) In other circumstances, I would do this with a LEFT JOIN onto the intermediate table and then look for NULL reaction_ids. In this case, I'm getting the pairs by doing a CROSS JOIN on the materials table and itself, but I'm not sure how (or whether) doing two LEFT JOINs onto the two materials aliases can work.

How can this be done?

I'm most interested in a generic SQL approach, but I'm currently using SQLite3 with SQLAlchemy. I have the option of moving the database to PostgreSQL, but SQLite is strongly preferred.

Upvotes: 0

Views: 54

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1270713

Use a cross join to generate the list and then remove the pairs that are in the same reaction.

select m.id, m2.id as id2
from materials m cross join
     materials m2
where not exists (select 1
                  from ingredient i join
                       ingredient i2
                       on i.reaction_id = i2.reaction_id and
                          i.material_id = m.id and
                          i2.material_id = m2.id
                 );

Although this query looks complicated, it is essentially a direct translation of your question. The where clause is saying that there are not two ingredients for the same reaction that have each of the materials.

For performance, you want an index on ingredient(reaction_id, material_id).

EDIT:

If you like, you can do this without an exists, using a left join and where:

select m.id, m2.id
from materials m cross join
     materials m2 left join
     ingredients i
     on i.material_id = m.id left join
     ingredients i2
     on i2.material_id = m2.id and
        i2.reaction_id = m2.reaction_id
where i2.reaction_id is null;

Upvotes: 2

Related Questions