Reputation: 4302
My Graph consists of Thing, Link and Tag nodes. A Thing can be :CONNECT ed to Link s and Tag s. The graph is around 45GB in size with 4mio nodes and 84mio relationships. Nodes consist only of one field with an key for external system reference.
I am looking for Thing s (t2) in my Graph that are strongly connected to another Thing (t1) via a Link.
The Cypher query is for example:
MATCH (t1:Thing {key: "a80b3828-6fec-4031-b552-d3397d1737b7"})
MATCH p=(t1)-[:CONNECT]->(l:Link)<-[:CONNECT]-(t2:Thing)
WHERE t1<>t2 WITH t2, count(DISTINCT p) AS links
RETURN t2, links
ORDER BY links DESC
LIMIT 25
Result:
+------------------------------------------------------------------+
| t2 | links |
+------------------------------------------------------------------+
| Node[166590]{key:"4b06471e-0849-4e56-b5c9-6bc04730c899"} | 854 |
| Node[190480]{key:"8934c635-17de-449a-9437-a24857b8b1c6"} | 698 |
...
| Node[754437]{key:"925b4f3a-0c69-46b9-8f35-294b9539a98b"} | 345 |
| Node[656436]{key:"50424c32-8ce8-495b-a8d3-f4864b1c3adc"} | 342 |
+------------------------------------------------------------------+
25 rows
18232 ms
Performance is not very very amazing. But that is not the main problem for me at the moment.
I like now to look for Thing s (t2) in my Graph that are strongly connected to another Thing (t1) via a Link and connected to a specific Tag.
MATCH (tag:`Tag` {key: "da8115ff-95fb-46d7-bd14-24bdd16f0a04"})
MATCH (t1:Thing {key: "a80b3828-6fec-4031-b552-d3397d1737b7"})
MATCH p=(t1)-[:CONNECT]->(l:Link)<-[:CONNECT]-(t2:Thing)-[:CONNECT]->(tag)
WHERE t1<>t2
RETURN t2, count(DISTINCT p) AS links
ORDER BY links DESC
LIMIT 25
The problem is, that this query runs forever. In my opinion this should run faster than the first Cpyher query because the part of the graph I am interested in is much smaller?
I am using Neo4j 2.1.2 running on a 80GB SDD, 16GB RAM for Java, 16GB RAM for the OS with 4 cores.
Upvotes: 1
Views: 115
Reputation: 41676
I think having 2 different relationship-types would help.
Do you have an index or constraint on :Tag(key)
?
How many paths are returned when you don't limit / order them? What's the performance difference if you just use limit without order?
MATCH (t1:Thing {key: "a80b3828-6fec-4031-b552-d3397d1737b7"})-[:CONNECT]->
(l:Link)<-[:CONNECT]-(t2:Thing)
RETURN t2, count(DISTINCT p) AS links
ORDER BY links DESC LIMIT 25
Can you try the new, experimental query planer on your query?
CYPHER 2.1.experimental
MATCH (t1:Thing {key: "a80b3828-6fec-4031-b552-d3397d1737b7"})-[:CONNECT]->
(l:Link)<-[:CONNECT]-(t2:Thing)
RETURN t2, count(DISTINCT p) AS links
ORDER BY links DESC LIMIT 25
Update
Try this:
MATCH (t1:Thing {key: "a80b3828-6fec-4031-b552-d3397d1737b7"})-[:CONNECT]->(l)
WITH distinct l,t1
MATCH (l)<-[:CONNECT]-(t2)
WHERE t1<>t2
RETURN t2, count(distinct l) AS links
ORDER BY links DESC
LIMIT 25
Upvotes: 1
Reputation: 4302
My fastest version after watching http://vimeo.com/84900121 is:
MATCH (t1:Thing {key: "a80b3828-6fec-4031-b552-d3397d1737b7"})
MATCH (t1)-[:CONNECT]->(l:Link) MATCH (l)<-[:CONNECT]-(t2:Thing)
WHERE t1<>t2 WITH t2, count(l) AS links
ORDER BY links DESC LIMIT 25 RETURN t2, links
and takes 7520 ms.
Using the new, experimental query planer:
CYPHER 2.1.experimental
MATCH (t1:Thing {key: "a80b3828-6fec-4031-b552-d3397d1737b7"})
MATCH (t1)-[:CONNECT]->(l:Link)
MATCH (l)<-[:CONNECT]-(t2:Thing)
WHERE t1<>t2
WITH t2, count(l) AS links
ORDER BY links DESC LIMIT 25
RETURN t2, links
takes 11942 ms
Your query
MATCH p=(t1:Thing {key: "a80b3828-6fec-4031-b552-d3397d1737b7"})-[:CONNECT]->(l:Link)<-[:CONNECT]-(t2:Thing)
RETURN t2, count(DISTINCT p) AS links
ORDER BY links DESC LIMIT 25
takes 13315 ms removing ORDER BY does not have an effect.
Using the new, experimental query planer:
CYPHER 2.1.experimental
MATCH p=(t1:Thing {key: "a80b3828-6fec-4031-b552-d3397d1737b7"})-[:CONNECT]->(l:Link)<-[:CONNECT]-(t2:Thing)
RETURN t2, count(DISTINCT p) AS links
ORDER BY links DESC LIMIT 25
takes 21200 ms.
A typical result looks like this:
+------------------------------------------------------------------+
| t2 | links |
+------------------------------------------------------------------+
| Node[166590]{key:"4b06471e-0849-4e56-b5c9-6bc04730c899"} | 854 |
| Node[190480]{key:"8934c635-17de-449a-9437-a24857b8b1c6"} | 698 |
| Node[153785]{key:"476f9756-e829-495c-82a2-6d05efa24a5a"} | 665 |
| Node[217705]{key:"4e69f054-3828-4c9c-829d-bc1b94499d26"} | 595 |
| Node[434466]{key:"372832f9-338b-4818-b7df-f60ec68e0dbc"} | 586 |
| Node[188948]{key:"ed023053-9c4d-4978-b884-cd7a77f9d610"} | 580 |
| Node[129093]{key:"49dfa255-8455-4f90-962d-63e9244b4337"} | 576 |
| Node[249272]{key:"fbdcbba8-c47d-4e4d-bae1-0ebadf6e7692"} | 553 |
| Node[253605]{key:"4c7848aa-eff7-4cd8-ab1e-79cb9a17fe59"} | 533 |
| Node[149177]{key:"9e5e830e-8ebd-456d-9e0c-094b5fc6964b"} | 519 |
| Node[155016]{key:"0d1148ff-3040-4c45-8198-a5918dd96721"} | 503 |
| Node[346978]{key:"640812b0-996e-4a1b-9a55-80417b010403"} | 478 |
| Node[256312]{key:"a16b233f-e363-4883-bc4c-9ccb1e4d1738"} | 463 |
| Node[203843]{key:"e739384f-0b1d-4f47-9abb-debfdc438796"} | 421 |
| Node[632642]{key:"8788d644-11ff-4a49-805e-32268183a712"} | 420 |
| Node[222103]{key:"eb4cdb73-67b7-46aa-a26f-1612a855a7d4"} | 419 |
| Node[156930]{key:"1ee362fb-0308-45e4-9dbc-1286e4571cb9"} | 414 |
| Node[445506]{key:"c1d50c68-748c-4551-ab4f-7313bf38ec70"} | 407 |
| Node[205489]{key:"fcf2f12a-b6cf-4d74-afbc-b75f1e57dd40"} | 384 |
| Node[134227]{key:"d44a2c49-35dd-44c1-82cb-c1386bb1beec"} | 368 |
| Node[360284]{key:"8fbed168-9183-455e-8869-350fbf9310f4"} | 365 |
| Node[223032]{key:"4424b8e8-d4aa-43dc-8b21-d6c119645092"} | 359 |
| Node[769577]{key:"8b538f97-8152-4552-aa6a-eadd78cfbe9c"} | 358 |
| Node[754437]{key:"925b4f3a-0c69-46b9-8f35-294b9539a98b"} | 345 |
| Node[656436]{key:"50424c32-8ce8-495b-a8d3-f4864b1c3adc"} | 342 |
+------------------------------------------------------------------+
Changing the Relationship Type will take some time :) So I will come back later with numbers for that.
Upvotes: 0