Get start and end nodes of specific path in a large graph

Question

I have a large graph (1,068,029 nodes and 2,602,897 relationships). I work with it through the Python API and send queries to it as part of my program flow.

I have the following two queries:

First query

MATCH 
(start_node)--(o:observed_data)--(i:indicator)--(m:malware)--(end_node:attack_pattern)
WHERE start_node.id in [id_list] 
RETURN start_node.id, end_node.name

Second query

MATCH 
(start_node)--(o1:observed_data)--(h:MD5)--(o2:observed_data)--(i:indicator)--(m:malware)--(end_node:attack_pattern)
WHERE start_node.id in [id_list] 
RETURN start_node.id, end_node.name

When I run the first query with an id_list of 75,000 entries, it completes and returns the expected output. When I run the second query, the graph gets stuck, even after I reduce the id_list to 20,000 entries.

The full id_list is larger than 75,000 entries, so I split it into chunks to keep the graph's response time down. If I split it into too many chunks, though, I increase the number of requests to the graph and the program's overall run time.
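For context, the chunking looks roughly like the sketch below. It is a simplified illustration, not my exact code: `chunked`, `run_in_chunks`, and the `run_query` callable are hypothetical names, and the query passes the list as a `$id_list` parameter instead of pasting the ids into the query text.

```python
from itertools import islice

# First query, with the id list passed as a parameter ($id_list).
QUERY = """
MATCH (start_node)--(o:observed_data)--(i:indicator)--(m:malware)--(end_node:attack_pattern)
WHERE start_node.id IN $id_list
RETURN start_node.id AS start_id, end_node.name AS end_name
"""


def chunked(ids, size):
    """Yield consecutive lists of at most `size` ids."""
    it = iter(ids)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def run_in_chunks(run_query, ids, size):
    """Run QUERY once per chunk and collect all returned rows.

    `run_query` is a hypothetical callable taking (query, **params) and
    returning a list of rows. With the official neo4j driver it could be,
    for example: lambda q, **p: session.run(q, **p).data()
    """
    rows = []
    for chunk in chunked(ids, size):
        rows.extend(run_query(QUERY, id_list=chunk))
    return rows
```

The chunk size is the trade-off described above: a smaller `size` means lighter individual queries but more round trips to the graph.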

My question: is there a library function (APOC or something similar) that performs the same lookup in less time? Or is there another solution that does not require reducing the id_list below 50,000 entries?
