matwasilewski

Reputation: 426

Cypher apoc.export.json.query is painfully slow

I'm trying to export a subgraph (all nodes and relationships on some path) from Neo4j to JSON.

I'm running a Cypher export query with:

WITH "{cypher_query}" AS query CALL apoc.export.json.query(query, "filename.jsonl", {}) YIELD file, source, format, nodes, relationships, properties, time, rows, batchSize, batches, done, data 
RETURN file, source, format, nodes, relationships, properties, time, rows, batchSize, batches, done, data;

where cypher_query is:

MATCH p = (ancestor: Term {term_id: 'root_id'})<-[:IS_A*..]-(children: Term) WITH nodes(p) as term, relationships(p) AS r, children AS x RETURN term, r, x

Ideally, each JSON line would be a (subject, relationship, object) triple, i.e. (node1, relationship between the nodes, node2). As I understand it, I currently get more than two nodes per line because each row collects all the nodes and relationships of a whole path.

The export takes more than two hours for roughly 80k nodes, and I'd like to speed it up.

  1. Would it benefit from being wrapped in apoc.periodic.iterate? I assumed apoc.export.json.query was already optimized in this regard, but I may be wrong.
  2. Would it help to replace the path-matching query in standard Cypher syntax with an APOC function?
  3. Is there a more efficient way to export a subgraph from a Neo4j database to JSON? I thought that creating a graph object and exporting it might work, but I don't know where the bottleneck is, so I don't know how to proceed.

Upvotes: 1

Views: 344

Answers (1)

Graphileon

Reputation: 5385

You could try this (although I don't see why you would need the relationships in the result, unless they have properties):

// limit the number of paths
MATCH p = (root: Term {term_id: 'root_id'})<-[:IS_A*..]-(leaf: Term) 
WHERE NOT EXISTS ((leaf)<-[:IS_A]-())

// extract all relationships
UNWIND relationships(p) AS rel

// return what you need (probably a subset of what is shown below, e.g. some properties)
RETURN startNode(rel) AS child, 
       rel,
       endNode(rel) AS parent
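
To write these triples to a file, the query can be passed to apoc.export.json.query in the same way as in the question. The following is only a sketch, assuming the Term label, term_id property and file name from the question. The DISTINCT is an addition that is not part of the query above: a relationship that lies on several root-to-leaf paths would otherwise be written once per path.

```cypher
// Sketch only: label, property and file name are copied from the question.
// DISTINCT is an addition that drops relationships repeated across paths.
WITH "MATCH p = (root:Term {term_id: 'root_id'})<-[:IS_A*1..]-(leaf:Term)
      WHERE NOT EXISTS ((leaf)<-[:IS_A]-())
      UNWIND relationships(p) AS rel
      RETURN DISTINCT startNode(rel) AS child, rel, endNode(rel) AS parent" AS query
CALL apoc.export.json.query(query, "filename.jsonl", {})
YIELD file, nodes, relationships, properties, time, rows
RETURN file, nodes, relationships, properties, time, rows;
```

Each exported row should then hold exactly one child, one relationship and one parent. Whether this is fast enough still depends on how many distinct paths lead from the root to the leaves.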
 
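On the question of using an APOC function instead of the variable-length pattern: one option that may be worth trying is apoc.path.subgraphAll, which collects the reachable nodes and the relationships between them without enumerating every path, combined with apoc.export.json.data. This is a sketch under assumptions, not a tested fix: it assumes APOC is installed with file export enabled, and it reuses the label, property and file name from the question.

```cypher
// Sketch only: assumes APOC is installed and file export is enabled.
MATCH (root:Term {term_id: 'root_id'})
// '<IS_A' follows incoming IS_A relationships, from the root towards its descendants;
// '+Term' restricts the expansion to Term nodes.
CALL apoc.path.subgraphAll(root, {relationshipFilter: '<IS_A', labelFilter: '+Term'})
YIELD nodes AS subgraphNodes, relationships AS subgraphRels
// Export the collected nodes and relationships in one call.
CALL apoc.export.json.data(subgraphNodes, subgraphRels, "filename.jsonl", {})
YIELD file, nodes, relationships, time
RETURN file, nodes, relationships, time;
```

Note that this writes one JSON line per node and per relationship rather than one row per triple, so the consumer of the file has to be able to work with that layout.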

Upvotes: 1
