Reputation: 381
I have Wikipedia data in a Cassandra table (one row = one wiki article). Now I want to load it into a graph database so I can see the relations between the articles. So far I have tried reading records from Cassandra one by one and adding them as nodes in Neo4j, but this is very slow. Is there a way, using Neo4j or Titan, to pull the data from Cassandra automatically and build a graph from it?
Upvotes: 1
Views: 355
Reputation: 1997
TL;DR: there is no ready-to-use tool for your case, but the import tool exists.
So, you want to migrate your data to Neo4j. The fastest way to do this is with the import tool.
Plan:
Get your data out of Cassandra and into CSV files.
Use the neo4j-import tool (in the bin/ directory), point it to your CSV files, and import them.
The import tool is really fast and can handle gigabytes of data.
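As a rough sketch of the CSV step, something like the following could write one node file and one relationship file using the header conventions of the import tool (`:ID`, `:LABEL`, `:START_ID`, `:END_ID`, `:TYPE`). The function name, the `Article` label, the `LINKS_TO` relationship type, and the shape of the input (a dict of title to linked titles) are all assumptions made for illustration; in practice the rows would come from your Cassandra table.

```python
import csv

def write_import_csvs(articles, nodes_out, rels_out):
    """Write node and relationship CSVs for a bulk import.

    articles: dict mapping an article title to an iterable of the
    titles it links to (hypothetical shape, not from the question).
    nodes_out / rels_out: writable text file objects.
    """
    nodes = csv.writer(nodes_out)
    rels = csv.writer(rels_out)

    # Header rows: the title doubles as the unique node ID.
    nodes.writerow(["title:ID", ":LABEL"])
    rels.writerow([":START_ID", ":END_ID", ":TYPE"])

    for title in articles:
        nodes.writerow([title, "Article"])

    for title, links in articles.items():
        for target in links:
            # Skip links to articles that have no node of their own.
            if target in articles:
                rels.writerow([title, target, "LINKS_TO"])
```

With files opened via `open("nodes.csv", "w", newline="")` and `open("rels.csv", "w", newline="")`, the import was invoked in Neo4j releases of that era roughly as `bin/neo4j-import --into graph.db --nodes nodes.csv --relationships rels.csv`; check the documentation for your version, since the tool and its flags have changed over time.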
Upvotes: 1
Reputation: 276
Titan offers bulk loading capability, which is recommended for loading large amounts of data:
http://s3.thinkaurelius.com/docs/titan/1.0.0/bulk-loading.html
Here is an older link that may also help, although some of the material is dated:
http://thinkaurelius.com/2014/05/29/powers-of-ten-part-i/
There has to be a "program" of some kind to translate the Wikipedia data into nodes and edges for the property graph. Maybe that is what you mean by "automatically" - asking if such an importer program exists out of the box.
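To give an idea of what such a translation program does, here is a minimal sketch that turns one article into edges by pulling internal `[[...]]` links out of its wikitext. It assumes each row holds a title and raw wikitext, which the question does not state; the function name and regular expression are illustrative only and do not handle templates, redirects, or namespaces such as `File:`.

```python
import re

# Matches [[Target]], [[Target|label]] and [[Target#Section|label]],
# capturing only the target page title.
WIKI_LINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")

def article_to_edges(title, wikitext):
    """Return (source, target) pairs, one per distinct linked title."""
    seen = set()
    edges = []
    for match in WIKI_LINK.finditer(wikitext):
        target = match.group(1).strip()
        # Ignore empty targets, self-links, and repeated links.
        if target and target != title and target not in seen:
            seen.add(target)
            edges.append((title, target))
    return edges
```

Each article becomes a node, and each returned pair becomes a directed edge from the article to the page it links to.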
I am not aware of a pre-existing program that loads Wikipedia data into Titan, although I am sure there is code somewhere. This link might help with Neo4j:
https://github.com/mirkonasato/graphipedia
Upvotes: 2