Nidhi Chachra

Reputation: 50

Batch processing in Cypher, or loading multiple files from the Neo4j browser

I am loading data from a CSV file into Neo4j using the following query:

CREATE CONSTRAINT ON (e:Entity) ASSERT e.entity IS UNIQUE;

USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM 'file:/file1.csv' AS line FIELDTERMINATOR '|'

WITH line 

MERGE (e0:Entity {entity: line.entities_0_entity})
ON CREATE SET e0.confidence = toFloat(line.entities_0_confidence)

MERGE (e1:Entity {entity: line.entities_1_entity})
ON CREATE SET e1.confidence = toFloat(line.entities_1_confidence)

MERGE (e0)-[r:REL {name: line.relation_relation, confidence: toFloat(line.relation_confidence)}]->(e1)

RETURN *

Could anyone tell me the equivalent query for loading the data from the Neo4j command line, or a way to change the file name dynamically in the browser, or to pass a pattern such as "file:/file*"?

Upvotes: 1

Views: 1140

Answers (2)

cybersam

Reputation: 66999

If you want to process the same Cypher statement multiple times, adjusting one or more values each time, the APOC procedure apoc.periodic.iterate can be used.

In your example, you'd want to perform the CREATE CONSTRAINT statement beforehand (and just once).

For example:

CALL apoc.periodic.iterate(
  "
    WITH ['file1', 'x', 'y'] AS filenames
    UNWIND filenames AS name
    RETURN name;
  ",
  "
    USING PERIODIC COMMIT 1000
    LOAD CSV WITH HEADERS FROM 'file:/' + {name} + '.csv' AS line FIELDTERMINATOR '|'
    WITH line 
    MERGE (e0:Entity {entity: line.entities_0_entity})
    ON CREATE SET e0.confidence = toFloat(line.entities_0_confidence)
    MERGE (e1:Entity {entity: line.entities_1_entity})
    ON CREATE SET e1.confidence = toFloat(line.entities_1_confidence)
    MERGE (e0)-[r:REL {name: line.relation_relation, confidence: toFloat(line.relation_confidence)}]->(e1);
  ",
  {});

This query will execute the LOAD CSV statement 3 times (sequentially, since the parallel option of the procedure is false by default), passing one of the strings ("file1", "x", and "y") each time as the name parameter.
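
The hard-coded list in the first statement could also be generated rather than typed by hand. The following is a minimal shell sketch, not part of the original answer: the cypher_name_list helper and the /import paths are hypothetical, and it assumes file names that contain no quotes or whitespace. It prints a Cypher list literal that can be pasted in place of ['file1', 'x', 'y'].

```shell
#!/bin/sh

# Hypothetical helper: print a Cypher list literal of base names (without .csv)
# for the file paths given as arguments.
cypher_name_list() {
    list=""
    for path in "$@"
    do
        # Strip the directory and the .csv suffix, since the query appends '.csv' itself.
        name=$(basename "$path" .csv)
        if [ -z "$list" ]; then
            list="'$name'"
        else
            list="$list, '$name'"
        fi
    done
    printf '[%s]\n' "$list"
}

# Example call with assumed paths; prints ['file1', 'x', 'y']
cypher_name_list /import/file1.csv /import/x.csv /import/y.csv
```

In practice the arguments would probably come from a glob over the import directory, for example cypher_name_list /path/to/import/*.csv.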

Upvotes: 4

Christophe Willemsen

Reputation: 20185

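You can simply put all your files in Neo4j's import directory and then use a shell script to load them all:

#!/bin/sh

# Loop over every file in Neo4j's import directory.
for file in /Users/ikwattro/dev/_graphs/310/import/*
do
    # By default, LOAD CSV resolves file:/// URLs relative to the import
    # directory, so pass only the file name rather than the full path.
    name=$(basename "$file")
    # Double quotes around the JSON body let the shell expand $name.
    curl -H "Content-Type: application/json" \
        -d "{\"statements\": [{\"statement\": \"LOAD CSV WITH HEADERS FROM 'file:///$name' AS row ...\"}]}" \
        http://localhost:7474/db/data/transaction/commit
done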

There is no standard way in Neo4j itself to specify multiple files to be imported.
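
As a variation on the loop above, the same idea could be driven through cypher-shell instead of the HTTP endpoint, which is closer to the "command line" the question asks about. This is only a sketch under stated assumptions: the build_statement helper, the IMPORT_DIR default, the RUN_IMPORT switch, and the NEO4J_PASSWORD variable are hypothetical names introduced for illustration, and the statement is the one from the question. The CREATE CONSTRAINT statement is assumed to have been run once beforehand.

```shell
#!/bin/sh

# Hypothetical helper: print the question's LOAD CSV statement for one file name.
# The name is interpreted relative to Neo4j's import directory.
build_statement() {
    cat <<EOF
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM 'file:///$1' AS line FIELDTERMINATOR '|'
MERGE (e0:Entity {entity: line.entities_0_entity})
ON CREATE SET e0.confidence = toFloat(line.entities_0_confidence)
MERGE (e1:Entity {entity: line.entities_1_entity})
ON CREATE SET e1.confidence = toFloat(line.entities_1_confidence)
MERGE (e0)-[r:REL {name: line.relation_relation, confidence: toFloat(line.relation_confidence)}]->(e1);
EOF
}

# Assumed location of the import directory; adjust to the local installation.
IMPORT_DIR="${IMPORT_DIR:-/var/lib/neo4j/import}"

# Only contact the database when explicitly asked to, so the script can be dry-run.
if [ "${RUN_IMPORT:-0}" = "1" ]; then
    for file in "$IMPORT_DIR"/*.csv
    do
        # cypher-shell reads the statement from standard input.
        build_statement "$(basename "$file")" | cypher-shell -u neo4j -p "$NEO4J_PASSWORD"
    done
fi
```

Running the script with RUN_IMPORT=1 would execute one LOAD CSV statement per file, sequentially; without it, build_statement can be called on its own to inspect the generated Cypher.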

Upvotes: 0
