PainFly
PainFly

Reputation: 55

optimization of mysql query

I have table with two columns (varchar from, varchar to). This table represents connections betwen nodes (from node, to node). I want to get all nodes connected from or to node that I specify and nodes connected from or to those nodes. Currently I use query below that gives me proper results but I'm searching for neater solution.

//currently used query specified node "node1"
SELECT tonode as node
FROM conn
WHERE
fromnode
IN
(SELECT tonode as node FROM conn WHERE fromnode="node1"
UNION
SELECT fromnode as node FROM conn WHERE tonode="node1")
UNION
SELECT fromnode as node
FROM conn
WHERE
tonode
IN
(SELECT tonode as node FROM conn WHERE fromnode="node1"
UNION
SELECT fromnode as node FROM conn WHERE tonode="node1")


//create table for conn table
CREATE TABLE `conn` (
  `fromnode` varchar(70) NOT NULL,
  `tonode` varchar(70) NOT NULL,
  PRIMARY KEY (`fromnode`,`tonode`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=DYNAMIC;
INSERT INTO `conn` (`fromnode`,`tonode`) VALUES
 ('node1','node2'),
 ('node1','node3'),
 ('node3','node2'),
 ('node4','node1'),
 ('node4','node2'),
 ('node4','node5'),
 ('node5','node6'),
 ('node4','node3');

Upvotes: 1

Views: 138

Answers (3)

spencer7593
spencer7593

Reputation: 108420

With these "from" and "to" relationships being bidirectional (you are needing to traverse both directions), there's just no easy statement to do that in MySQL. To get all of the node values in a single result set returned in a single column, the closest I can come to avoiding a UNION operation is:

SELECT CASE 
       WHEN t.i = 1 THEN t.dnode
       WHEN t.i = 2 AND t.dnode = c.fromnode THEN c.tonode
       WHEN t.i = 2 AND t.dnode = c.tonode THEN c.fromnode
       ELSE NULL
       END AS node
  FROM ( SELECT d.i
              , m.root
              , CASE WHEN m.root = n.fromnode THEN n.tonode ELSE n.fromnode END AS dnode
           FROM (SELECT 'node1' AS root) m
          CROSS
           JOIN (SELECT 1 AS i UNION ALL SELECT 2) d
           LEFT
           JOIN conn n ON m.root IN (n.fromnode,n.tonode)
       ) t
  LEFT
  JOIN conn c
    ON t.i = 2 AND t.dnode IN (c.fromnode,c.tonode)
 GROUP BY node
 ORDER BY node

I don't know if I'm even going to be able to unpack that, but I'll try. To avoid having to specify the root node 'node1' multiple times, I use a subquery to return it.

                (SELECT 'node1' AS root) m

Because we are going "two levels deep", I need two sets of nodes, so I create a Cartesian product to double the number of rows I've got, and I'm going to label them 1 for the first level, and 2 for the second level.

          CROSS
           JOIN (SELECT 1 AS i UNION ALL SELECT 2) d

With that, I'm now ready to join to the conn table, and I want any rows that have either a fromnode or tonode value that matches the root node.

           LEFT
           JOIN conn n ON m.root IN (n.fromnode,n.tonode)

With that resultset, I want to "flip" the fromnode and tonode on some of those rows so that we basically always have the "root" node on one side. I do this with a CASE expression that tests which side matches the root:

CASE WHEN m.root = n.fromnode THEN n.tonode ELSE n.fromnode END AS dnode

So now I wrap that resultset as an inline view aliased t. That subquery can be run separately, to see that we're returning what we expect:

SELECT d.i
     , m.root
     , CASE WHEN m.root = n.fromnode THEN n.tonode ELSE n.fromnode END AS dnode
  FROM (SELECT 'node1' AS root) m
 CROSS
  JOIN (SELECT 1 AS i UNION ALL SELECT 2) d
  LEFT
  JOIN conn n ON m.root IN (n.fromnode,n.tonode)

We do need to return that "level" value (d.i we generated earlier, we need it on the next step, when we join to the conn table again, to traverse the next level, and I only need to join those rows where we are going to look at the second level.

  LEFT
  JOIN conn c
    ON t.i = 2 AND t.dnode IN (c.fromnode,c.tonode)

And again, I don't care which side node is on, at this point, I just need to do the match.

At this point, you could run the whole query, and pull t.*, c.* to see what we've got, but I'm going to skip that part and go right to the "magic".

At this point, we can use a CASE expression to "pick out" the node value we want from that mess.

If the level value (unfortunately labeled i) is a 1, then we're looking at the first level, so we just need to get the "nodeX" value that was on the "other" side of the "root". That's available from the t source as expression aliased as dnode.

Otherwise, we're going to look at the rows for the "second" level, i = 2. (In this particular case, the test on i = 2 could be omitted, but it's included hear for completeness, and just in case we are going to extend this approach to get three (gasp!) or more (oh my!) levels.

Here, we need to know which side (from or to) matched the first level, and we just pull the other side. If t.dnode matches on side, we pull the other side.

Finally, we use a GROUP BY to collapse the duplicates.

Since we don't care what level these were from, we omit returning t.i, which would give us the level.


SUMMARY

I don't think this is any more straightforward than your query. And I have no idea how performance would compare. But it's nice to have other statements to compare performance against.

Upvotes: 0

Nir Alfasi
Nir Alfasi

Reputation: 53535

if I understood you correctly (about going only 2 levels deep), you can do something like this:

SELECT level,fromnode , tonode 
FROM conn1 
WHERE level < 3
CONNECT BY PRIOR tonode = fromnode
START WITH fromnode like '%';

Upvotes: 0

Puggan Se
Puggan Se

Reputation: 5846

My optimized version:

SET @origin = "node1";
SELECT DISTINCT
 IF(c1.fromnode = @origin,
   IF(c1.tonode = c2.tonode,
     IF(c2.fromnode = @origin, c2.tonode, c2.fromnode),
     IF(c2.tonode = @origin, c2.fromnode, c2.tonode)
   ),
   IF(c1.fromnode = c2.tonode,
     IF(c2.fromnode = @origin, c2.tonode, c2.fromnode),
     IF(c2.tonode = @origin, c2.fromnode, c2.tonode)
   )
 ) AS node
FROM conn AS c1
LEFT JOIN conn AS c2 ON (c1.fromnode = c2.fromnode OR c1.tonode = c2.fromnode OR c1.fromnode = c2.tonode OR c1.tonode = c2.tonode)
WHERE c1.fromnode = @origin OR c1.tonode = @origin;

the DESCRIBE output of your old query:

+----+--------------------+------------+--------+---------------+---------+---------+------------+------+--------------------------+
| id | select_type        | table      | type   | possible_keys | key     | key_len | ref        | rows | Extra                    |
+----+--------------------+------------+--------+---------------+---------+---------+------------+------+--------------------------+
|  1 | PRIMARY            | conn       | index  | NULL          | PRIMARY | 424     | NULL       |    8 | Using where; Using index |
|  2 | DEPENDENT SUBQUERY | conn       | eq_ref | PRIMARY       | PRIMARY | 424     | const,func |    1 | Using where; Using index |
|  3 | DEPENDENT UNION    | conn       | eq_ref | PRIMARY       | PRIMARY | 424     | func,const |    1 | Using where; Using index |
| NULL | UNION RESULT       | <union2,3> | ALL    | NULL          | NULL    | NULL    | NULL       | NULL |                          |
|  4 | UNION              | conn       | index  | NULL          | PRIMARY | 424     | NULL       |    8 | Using where; Using index |
|  5 | DEPENDENT SUBQUERY | conn       | eq_ref | PRIMARY       | PRIMARY | 424     | const,func |    1 | Using where; Using index |
|  6 | DEPENDENT UNION    | conn       | eq_ref | PRIMARY       | PRIMARY | 424     | func,const |    1 | Using where; Using index |
| NULL | UNION RESULT       | <union5,6> | ALL    | NULL          | NULL    | NULL    | NULL       | NULL |                          |
| NULL | UNION RESULT       | <union1,4> | ALL    | NULL          | NULL    | NULL    | NULL       | NULL |                          |
+----+--------------------+------------+--------+---------------+---------+---------+------------+------+--------------------------+

the DESCRIBE output of my query:

+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------------------------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra                                     |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------------------------------------+
|  1 | SIMPLE      | c1    | index | PRIMARY       | PRIMARY | 424     | NULL |    8 | Using where; Using index; Using temporary |
|  1 | SIMPLE      | c2    | index | PRIMARY       | PRIMARY | 424     | NULL |    8 | Using index                               |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------------------------------------+

Upvotes: 3

Related Questions