Performance of the transitive closure query in Neo4j

Question

I am trying to compute the transitive closure of an undirected graph in Neo4j using the following Cypher query ("E" is the relationship type shared by every edge of the graph):

MATCH (a) -[:E*]- (b) WHERE ID(a) < ID(b) RETURN DISTINCT a, b

I ran this query on a graph with 10k nodes and around 150k edges, but it had not finished even after 8 hours. I find this surprising, because even the most naive SQL solutions are much faster, and I expected Neo4j to be much more efficient for this kind of standard graph query. Is there something I am missing, such as some tuning of the Neo4j server or a better way to write the query?

Edit

Here is the result of EXPLAINing the above query:

+--------------------------------------------+
| No data returned, and nothing was changed. |
+--------------------------------------------+
908 ms

Compiler CYPHER 3.3

Planner COST

Runtime INTERPRETED

+-----------------------+----------------+------------------+--------------------------------+
| Operator              | Estimated Rows | Variables        | Other                          |
+-----------------------+----------------+------------------+--------------------------------+
| +ProduceResults       |          14069 | a, b             |                                |
| |                     +----------------+------------------+--------------------------------+
| +Distinct             |          14069 | a, b             | a, b                           |
| |                     +----------------+------------------+--------------------------------+
| +Filter               |          14809 | anon[11], a, b   | ID(a) < ID(b)                  |
| |                     +----------------+------------------+--------------------------------+
| +VarLengthExpand(All) |          49364 | anon[11], b -- a | (a)-[:E*]-(b)                  |
| |                     +----------------+------------------+--------------------------------+
| +AllNodesScan         |          40012 | a                |                                |
+-----------------------+----------------+------------------+--------------------------------+

Total database accesses: ?
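
For context on the plan above: a variable-length pattern such as (a)-[:E*]-(b) matches every path that does not reuse a relationship, so the VarLengthExpand(All) step may enumerate a very large number of paths between the same two nodes before Distinct collapses them. In an undirected graph, the transitive closure is simply the set of node pairs that lie in the same connected component, which can be computed in near-linear time. The following is a minimal sketch of that idea in plain Python, outside the database. It is not a Neo4j API; the names connected_components and closure_pairs are hypothetical, and it assumes nodes are given as comparable ids and edges as pairs of those ids.

```python
from collections import defaultdict
from itertools import combinations


def connected_components(nodes, edges):
    """Group nodes into connected components using union-find.

    Assumption: every endpoint in `edges` also appears in `nodes`.
    """
    parent = {n: n for n in nodes}

    def find(x):
        # Walk up to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv  # merge the two components

    groups = defaultdict(list)
    for n in nodes:
        groups[find(n)].append(n)
    return [sorted(g) for g in groups.values()]


def closure_pairs(nodes, edges):
    """Yield each reachable pair (a, b) with a < b exactly once."""
    for component in connected_components(nodes, edges):
        # combinations() over a sorted list yields pairs in ascending order.
        yield from combinations(component, 2)


if __name__ == "__main__":
    nodes = [1, 2, 3, 4, 5]
    edges = [(1, 2), (2, 3), (4, 5)]
    print(sorted(closure_pairs(nodes, edges)))
    # prints [(1, 2), (1, 3), (2, 3), (4, 5)]
```

The output itself can still be large (a single component of n nodes yields n(n-1)/2 pairs), but the work is proportional to the number of edges plus the number of result pairs rather than to the number of distinct paths.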
