Reputation: 468
I'm trying to write a single Cypher query that exports two CSV files, a node list and an edge list, which can then be analyzed by a library like igraph. I have my database set up with only a single relationship type:
(a:paper)-[:REFERENCES]->(b:paper)
I have multiple properties on each node (title, author, etc.). The unique identifier is paper_id.
I've been trying to use the APOC procedures apoc.export.csv.query and apoc.export.csv.data.
I can export the nodes and edges to a single file:
MATCH (n:paper)<-[r:REFERENCES]-(m:paper) WHERE n.paper_id = '1234'
WITH COLLECT(m) AS paper, COLLECT(r) AS references
CALL apoc.export.csv.data(
paper,
references,
'network.csv',
{}
) YIELD file, nodes, relationships
RETURN file, nodes, relationships
Or I can export just the edge list:
MATCH (n:paper)<-[r:REFERENCES]-(m:paper) WHERE n.paper_id = '1234'
CALL apoc.export.csv.data(n.paper_id, m.paper_id, 'edge.csv', {})
WITH n.paper_id AS From, m.paper_id AS To
;
Ideally, I would like a single query which would produce two files.
An edge list:
From | To
1234 | 4567
1234 | 8910
And a node list:
paper_id | title | author
1234 | "a title" | "a name"
4567 | "another title" | "another name"
8910 | "a third title" | "third name"
Neo4j CE 3.4.11
Upvotes: 2
Views: 1032
Reputation: 2905
You don't really have much control over the output format of apoc.export.csv.data (or any of its related procedures): you're pretty much always going to see internal node IDs (and all the other metadata) instead of just your desired unique values.
Still, assuming you can do some futzing about on the import side, you can export two files, one with edges and one with nodes.
Assuming you want enough information in the nodes.csv file to recreate the graph (i.e. you need both the referencing papers and the referenced paper itself), and using the sample Movies database:
// Collect the movie plus its actors as nodes, and the ACTED_IN relationships as edges
MATCH (movie:Movie {title: 'Top Gun'})<-[acted_in:ACTED_IN]-(actor:Person)
WITH collect(DISTINCT actor) + movie AS nodes, collect(DISTINCT acted_in) AS relationships
// Export relationships only, then nodes only, each to its own file
CALL apoc.export.csv.data([], relationships, 'edges.csv', {}) YIELD file AS edgefile
CALL apoc.export.csv.data(nodes, [], 'nodes.csv', {}) YIELD file AS nodefile
RETURN edgefile, nodefile
This yields two files in the import folder, with contents as below. It's not clear whether this achieves what you want, since the only consistent identifier across the two files is the internal node ID (which is sufficient to rebuild an equivalent graph).
nodes.csv
"_id","_labels","born","name","released","tagline","title","_start","_end","_type"
"31",":Person","1959","Val Kilmer","","","",,,
"34",":Person","1961","Meg Ryan","","","",,,
"33",":Person","1933","Tom Skerritt","","","",,,
"30",":Person","1957","Kelly McGillis","","","",,,
"16",":Person","1962","Tom Cruise","","","",,,
"32",":Person","1962","Anthony Edwards","","","",,,
"29",":Movie","","","Top Gun","1986","I feel the need, the need for speed.",,,
edges.csv
"_id","_labels","_start","_end","_type","roles"
,,"16","29","ACTED_IN","[""Maverick""]"
,,"30","29","ACTED_IN","[""Charlie""]"
,,"31","29","ACTED_IN","[""Iceman""]"
,,"32","29","ACTED_IN","[""Goose""]"
,,"33","29","ACTED_IN","[""Viper""]"
,,"34","29","ACTED_IN","[""Carole""]"
Upvotes: 3