Anna
Anna

Reputation: 1

Neo4j loading csv for a very long time

Hi everyone on Stackoverflow,

I have simple query to load csv, but browser takes long time to load and still shows loading sign.

Here is my query:

LOAD CSV WITH HEADERS FROM "file:///ratings_small.csv" AS line 
WITH line MERGE (u:User{name:line.userId}) 
MERGE (m:Movie {movieId:line.movieId}) 
MERGE (u)-[r:RATED {rating:line.rating}]->(m) 
RETURN u, m, r

I don't see mistake in the code..

ratings_small.csv has following columns:

userId movieId rating timestamp

1 31 2.5 1260759144

1 1029 3 1260759179

1 1061 3 1260759182

1 1129 2 1260759185

1 1172 4 1260759205

There are 100 004 records.

Kind regards, Anna

Upvotes: 0

Views: 370

Answers (1)

Christophe Willemsen
Christophe Willemsen

Reputation: 20185

I believe you don't have indexes on both :User(name) and :Movie(movieId).

Perform the two following queries and retry your LOAD CSV command, it should then import it in a couple of seconds :

CREATE CONSTRAINT ON (u:User) ASSERT u.name IS UNIQUE;
CREATE CONSTRAINT ON (m:Movie) ASSERT m.moveId IS UNIQUE;

Secondly, I suppose users rate a movie only once, so the MERGE of the relationship is not necessary to happen with the property, you can change your query like this :

LOAD CSV WITH HEADERS FROM "file:///ratings_small.csv" AS line 
WITH line MERGE (u:User{name:line.userId}) 
MERGE (m:Movie {movieId:line.movieId}) 
MERGE (u)-[r:RATED]->(m)
SET r.rating = line.rating
RETURN u, m, r

To avoid Neo4j to try to process the full CSV in one transaction, you can specify the batch size of transactional operations (to avoid memory issues) like this :

USING PERIODIC COMMIT 2000
LOAD CSV WITH HEADERS FROM "file:///ratings_small.csv" AS line 
WITH line MERGE (u:User{name:line.userId}) 
MERGE (m:Movie {movieId:line.movieId}) 
MERGE (u)-[r:RATED]->(m)
SET r.rating = line.rating
RETURN u, m, r

Lastly, I doubt you have any application or even the Neo4j browser that can visualise everything at once, so I would not return anything from the LOAD CSV query :

USING PERIODIC COMMIT 2000
LOAD CSV WITH HEADERS FROM "file:///ratings_small.csv" AS line 
WITH line MERGE (u:User{name:line.userId}) 
MERGE (m:Movie {movieId:line.movieId}) 
MERGE (u)-[r:RATED]->(m)
SET r.rating = line.rating

Upvotes: 2

Related Questions