Neo4j: Creating relationships from a CSV file is really slow with py2neo

Question

I'm trying to load a CSV file of flight data (25 MB, 150,000 rows, 22 columns) into a Neo4j graph using py2neo.

The import runs as a single Cypher query that creates the nodes (Airport, City, Flight and Plane) and the relationships between them. When I run the code, it takes forever, even with USING PERIODIC COMMIT.

I am not sure whether the Cypher query I've written is optimized, and it might be the source of the slowness. For 10,000 rows, it took around 10 minutes to build the graph. Can anyone help me, please? Here is the code:

from py2neo import Graph


def importFromCSVtoNeo(graph):
    # One query: load the tab-separated CSV and MERGE all nodes and relationships per row.
    query = '''
    USING PERIODIC COMMIT 1000
    LOAD CSV WITH HEADERS FROM "file:///flights.csv" AS row FIELDTERMINATOR '\t'
    WITH row

    MERGE (c_departure:City {cityName: row.cityName_departure})
    MERGE (a_departure:Airport {airportName: row.airportName_departure})
    MERGE (f_segment1:Flight {airline: row.airline1})
    ON CREATE SET f_segment1.class = row.class1,
                  f_segment1.outboundclassgroup = row.outboundclassgroup1

    MERGE (a_departure)-[:IN]->(c_departure)
    MERGE (c_departure)-[:HAS]->(a_departure)
    MERGE (f_segment1)-[:FROM {departAt: row.outbounddeparttime}]->(a_departure)

    MERGE (c_transfer:City {cityName: row.transferCityName})
    MERGE (a_transfer:Airport {airportName: row.airportName_transfer})
    MERGE (f_segment1)-[:TO_TRANSFER {transferArriveAt: row.transferArriveAt}]->(a_transfer)
    MERGE (a_transfer)-[:IN]->(c_transfer)
    MERGE (c_transfer)-[:HAS]->(a_transfer)

    MERGE (c_arrival:City {cityName: row.cityName_arrival})
    MERGE (a_arrival:Airport {airportName: row.airportName_arrival})
    MERGE (f_segment2:Flight {airline: row.airline2})
    ON CREATE SET f_segment2.class = row.class2,
                  f_segment2.outboundclassgroup = row.outboundclassgroup2
    MERGE (f_segment2)-[:TO {arrivalAt: row.outboundarrivaltime}]->(a_arrival)
    MERGE (f_segment2)-[:FROM_TRANSFER {transferDepartAt: row.transferDepartAt}]->(a_transfer)
    MERGE (a_arrival)-[:IN]->(c_arrival)
    MERGE (c_arrival)-[:HAS]->(a_arrival)

    MERGE (p:Plane {saleprice: row.saleprice})
    ON CREATE SET p.depart = row.cityName_departure,
                  p.destination = row.cityName_arrival,
                  p.salechannel = row.salechannel,
                  p.planeDuration = row.planeDuration
    MERGE (p)-[:HAS_FLIGHTS]->(f_segment1)
    MERGE (f_segment1)-[:WAIT_FOR {waitingTime: row.waitingTime}]->(f_segment2)
    '''

    graph.run(query)


if __name__ == '__main__':
    graph = Graph()
    importFromCSVtoNeo(graph)
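
One thing worth checking with a query like this: a MERGE on a property that has no index has to scan every node with that label, so each row gets slower as the graph grows. The sketch below is only an illustration of creating indexes on the MERGE keys before the import. It assumes the Neo4j 3.x "CREATE INDEX ON" syntax (newer versions use a different form) and that the four label/property pairs from the query above are the right keys. MERGE_KEYS, index_statements and create_indexes are hypothetical names, not part of py2neo.

```python
# Label -> property pairs used as MERGE keys in the import query (assumed from the query).
MERGE_KEYS = {
    "City": "cityName",
    "Airport": "airportName",
    "Flight": "airline",
    "Plane": "saleprice",
}


def index_statements(keys=MERGE_KEYS):
    """Build one CREATE INDEX statement per (label, property) pair."""
    # Neo4j 3.x syntax (assumption); adjust for the server version in use.
    return [
        "CREATE INDEX ON :%s(%s)" % (label, prop)
        for label, prop in sorted(keys.items())
    ]


def create_indexes(graph, keys=MERGE_KEYS):
    """Run each statement through graph.run(); graph is expected to be a py2neo Graph."""
    for statement in index_statements(keys):
        graph.run(statement)
```

Calling create_indexes(graph) once before importFromCSVtoNeo(graph) would be the intended usage. Whether this removes the slowdown here is untested.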

I've also tried to do it in batch mode, but the performance doesn't get any better. I'd appreciate any comments or suggestions. Thanks!
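
For context, "batch mode" here means something along these lines: read the CSV in Python, group the rows into chunks, and send each chunk as a parameter to an UNWIND query. This is a minimal sketch, not the exact code that was tried. It assumes a tab-separated file with a header row, the `$rows` parameter syntax (Neo4j 3.1 or later), and only covers the departure city and airport to stay short. read_batches, BATCH_QUERY and import_in_batches are hypothetical names.

```python
import csv
from itertools import islice


def read_batches(path, batch_size=1000, delimiter="\t"):
    """Yield lists of row dicts, at most batch_size rows each."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter=delimiter)
        while True:
            batch = list(islice(reader, batch_size))
            if not batch:
                break
            yield batch


# Shortened query: only the departure city/airport part of the full import.
BATCH_QUERY = '''
UNWIND $rows AS row
MERGE (c:City {cityName: row.cityName_departure})
MERGE (a:Airport {airportName: row.airportName_departure})
MERGE (a)-[:IN]->(c)
'''


def import_in_batches(graph, path, batch_size=1000):
    """Send each batch to graph.run() as the 'rows' parameter; return rows sent."""
    total = 0
    for batch in read_batches(path, batch_size):
        graph.run(BATCH_QUERY, rows=batch)
        total += len(batch)
    return total
```

Batching like this only cuts the number of round trips and transactions. The cost of each MERGE stays the same.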
