hellomichibye
hellomichibye

Reputation: 4302

Cypher query gets slower if the part of the graph gets smaller

My Graph consists of Thing, Link and Tag nodes. A Thing can be :CONNECT ed to Link s and Tag s. The graph is around 45GB in size with 4mio nodes and 84mio relationships. Nodes consist only of one field with an key for external system reference.

I am looking for Thing s (t2) in my Graph that are strongly connected to another Thing (t1) via a Link.

The Cypher query is for example:

MATCH (t1:Thing {key: "a80b3828-6fec-4031-b552-d3397d1737b7"}) 
MATCH p=(t1)-[:CONNECT]->(l:Link)<-[:CONNECT]-(t2:Thing) 
WHERE t1<>t2 WITH t2, count(DISTINCT p) AS links 
RETURN t2, links 
ORDER BY links DESC 
LIMIT 25

Result:

+------------------------------------------------------------------+
| t2                                                       | links |
+------------------------------------------------------------------+
| Node[166590]{key:"4b06471e-0849-4e56-b5c9-6bc04730c899"} | 854   |
| Node[190480]{key:"8934c635-17de-449a-9437-a24857b8b1c6"} | 698   |
...
| Node[754437]{key:"925b4f3a-0c69-46b9-8f35-294b9539a98b"} | 345   |
| Node[656436]{key:"50424c32-8ce8-495b-a8d3-f4864b1c3adc"} | 342   |
+------------------------------------------------------------------+
25 rows
18232 ms

Performance is not very very amazing. But that is not the main problem for me at the moment.

I like now to look for Thing s (t2) in my Graph that are strongly connected to another Thing (t1) via a Link and connected to a specific Tag.

MATCH (tag:`Tag` {key: "da8115ff-95fb-46d7-bd14-24bdd16f0a04"}) 
MATCH (t1:Thing {key: "a80b3828-6fec-4031-b552-d3397d1737b7"}) 
MATCH p=(t1)-[:CONNECT]->(l:Link)<-[:CONNECT]-(t2:Thing)-[:CONNECT]->(tag) 
WHERE t1<>t2 
RETURN t2, count(DISTINCT p) AS links 
ORDER BY links DESC 
LIMIT 25

The problem is, that this query runs forever. In my opinion this should run faster than the first Cpyher query because the part of the graph I am interested in is much smaller?

I am using Neo4j 2.1.2 running on a 80GB SDD, 16GB RAM for Java, 16GB RAM for the OS with 4 cores.

Upvotes: 1

Views: 115

Answers (2)

Michael Hunger
Michael Hunger

Reputation: 41676

I think having 2 different relationship-types would help.

Do you have an index or constraint on :Tag(key) ?

How many paths are returned when you don't limit / order them? What's the performance difference if you just use limit without order?

MATCH (t1:Thing {key: "a80b3828-6fec-4031-b552-d3397d1737b7"})-[:CONNECT]->
      (l:Link)<-[:CONNECT]-(t2:Thing) 
RETURN t2, count(DISTINCT p) AS links 
ORDER BY links DESC LIMIT 25

Can you try the new, experimental query planer on your query?

CYPHER 2.1.experimental
MATCH (t1:Thing {key: "a80b3828-6fec-4031-b552-d3397d1737b7"})-[:CONNECT]->
      (l:Link)<-[:CONNECT]-(t2:Thing) 
RETURN t2, count(DISTINCT p) AS links 
ORDER BY links DESC LIMIT 25

Update

Try this:

MATCH (t1:Thing {key: "a80b3828-6fec-4031-b552-d3397d1737b7"})-[:CONNECT]->(l)
WITH distinct l,t1
MATCH (l)<-[:CONNECT]-(t2)  
WHERE t1<>t2  
RETURN t2, count(distinct l) AS links  
ORDER BY links DESC 
LIMIT 25  

Upvotes: 1

hellomichibye
hellomichibye

Reputation: 4302

My fastest version after watching http://vimeo.com/84900121 is:

MATCH (t1:Thing {key: "a80b3828-6fec-4031-b552-d3397d1737b7"}) 
MATCH (t1)-[:CONNECT]->(l:Link) MATCH (l)<-[:CONNECT]-(t2:Thing) 
WHERE t1<>t2 WITH t2, count(l) AS links 
ORDER BY links DESC LIMIT 25 RETURN t2, links

and takes 7520 ms.

Using the new, experimental query planer:

CYPHER 2.1.experimental 
MATCH (t1:Thing {key: "a80b3828-6fec-4031-b552-d3397d1737b7"})  
MATCH (t1)-[:CONNECT]->(l:Link)  
MATCH (l)<-[:CONNECT]-(t2:Thing)  
WHERE t1<>t2  
WITH t2, count(l) AS links  
ORDER BY links DESC LIMIT 25  
RETURN t2, links

takes 11942 ms

Your query

MATCH p=(t1:Thing {key: "a80b3828-6fec-4031-b552-d3397d1737b7"})-[:CONNECT]->(l:Link)<-[:CONNECT]-(t2:Thing)  
RETURN t2, count(DISTINCT p) AS links  
ORDER BY links DESC LIMIT 25

takes 13315 ms removing ORDER BY does not have an effect.

Using the new, experimental query planer:

CYPHER 2.1.experimental  
MATCH p=(t1:Thing {key: "a80b3828-6fec-4031-b552-d3397d1737b7"})-[:CONNECT]->(l:Link)<-[:CONNECT]-(t2:Thing)  
RETURN t2, count(DISTINCT p) AS links  
ORDER BY links DESC LIMIT 25

takes 21200 ms.

A typical result looks like this:

+------------------------------------------------------------------+
| t2                                                       | links |
+------------------------------------------------------------------+
| Node[166590]{key:"4b06471e-0849-4e56-b5c9-6bc04730c899"} | 854   |
| Node[190480]{key:"8934c635-17de-449a-9437-a24857b8b1c6"} | 698   |
| Node[153785]{key:"476f9756-e829-495c-82a2-6d05efa24a5a"} | 665   |
| Node[217705]{key:"4e69f054-3828-4c9c-829d-bc1b94499d26"} | 595   |
| Node[434466]{key:"372832f9-338b-4818-b7df-f60ec68e0dbc"} | 586   |
| Node[188948]{key:"ed023053-9c4d-4978-b884-cd7a77f9d610"} | 580   |
| Node[129093]{key:"49dfa255-8455-4f90-962d-63e9244b4337"} | 576   |
| Node[249272]{key:"fbdcbba8-c47d-4e4d-bae1-0ebadf6e7692"} | 553   |
| Node[253605]{key:"4c7848aa-eff7-4cd8-ab1e-79cb9a17fe59"} | 533   |
| Node[149177]{key:"9e5e830e-8ebd-456d-9e0c-094b5fc6964b"} | 519   |
| Node[155016]{key:"0d1148ff-3040-4c45-8198-a5918dd96721"} | 503   |
| Node[346978]{key:"640812b0-996e-4a1b-9a55-80417b010403"} | 478   |
| Node[256312]{key:"a16b233f-e363-4883-bc4c-9ccb1e4d1738"} | 463   |
| Node[203843]{key:"e739384f-0b1d-4f47-9abb-debfdc438796"} | 421   |
| Node[632642]{key:"8788d644-11ff-4a49-805e-32268183a712"} | 420   |
| Node[222103]{key:"eb4cdb73-67b7-46aa-a26f-1612a855a7d4"} | 419   |
| Node[156930]{key:"1ee362fb-0308-45e4-9dbc-1286e4571cb9"} | 414   |
| Node[445506]{key:"c1d50c68-748c-4551-ab4f-7313bf38ec70"} | 407   |
| Node[205489]{key:"fcf2f12a-b6cf-4d74-afbc-b75f1e57dd40"} | 384   |
| Node[134227]{key:"d44a2c49-35dd-44c1-82cb-c1386bb1beec"} | 368   |
| Node[360284]{key:"8fbed168-9183-455e-8869-350fbf9310f4"} | 365   |
| Node[223032]{key:"4424b8e8-d4aa-43dc-8b21-d6c119645092"} | 359   |
| Node[769577]{key:"8b538f97-8152-4552-aa6a-eadd78cfbe9c"} | 358   |
| Node[754437]{key:"925b4f3a-0c69-46b9-8f35-294b9539a98b"} | 345   |
| Node[656436]{key:"50424c32-8ce8-495b-a8d3-f4864b1c3adc"} | 342   |
+------------------------------------------------------------------+

Changing the Relationship Type will take some time :) So I will come back later with numbers for that.

Upvotes: 0

Related Questions