Reputation: 25
I hold a set of nodes in one mysql table1 and a table of edges in another one (table2). Nodes come with primary keys and edges use this "foreign key"
**table1**
id label
1 node1
2 node2
3 node3
**table2**
FK_first FK_sec rel
1 3 guardian
2 1 guardian
1 3 times
I know the db-design is not perfect, but its simple... Now i want the number of 'rel' for every node and do a query like:
SELECT
label,
COUNT( rel ) as freq
FROM
`table1`
LEFT JOIN table2 ON (id=FK_first OR id=FK_second)
GROUP BY label
ORDER BY freq DESC
I have about 1000 nodes and 2000 edges. A query with ON (id=FK_first OR id=FK_second), then the query is way faster (<1 sec). The other query needs about 6 sec which is ver slow.
I would appreciate some comments to speed this up a bit :-)
LEFT JOIN table2 ON (id=FK_second) ~0.16 sec
LEFT JOIN table2 ON id IN (FK_first,FK_second) ~6 sec
EXPLAIN 1: id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE table1 ALL NULL NULL NULL NULL 2571 Using temporary; Using filesort 1 SIMPLE table2 ALL FK_first,FK_second,FK_first_2 NULL NULL NULL 3858 EXPLAIN 2: id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE table1 index NULL PRIMARY 2 NULL 2571 Using index; Using temporary; Using filesort 1 SIMPLE table2 ref FK_first,FK_first_2 FK_first_2 4 table1.id 1
Upvotes: 0
Views: 2366
Reputation: 18290
Try doing two joins and moving the "OR" into the COUNT() function:
For every row, this joins table2 once on FK1, then again on FK2 (if it is not already joined to that row via FK1. Then in the COUNT, we specify that only rows which have either join's rel column as non-null.
SELECT
label,
COUNT( table2A.rel || table2B.rel ) as freq
FROM
`table1`
LEFT JOIN
table2 as table2A
ON id=table2A.FK_first
LEFT JOIN
table2 as table2B
ON id=table2B.FK_second
AND table2A.FKFirst != table2B.FKFirst
GROUP BY label
ORDER BY freq DESC
Upvotes: 3