yogk

Reputation: 381

Loading Cassandra data into Titan/Neo4j

I have Wikipedia data in a Cassandra table (one row = one wiki article). Now I want to load it into a graph database so I can see the relations between articles. So far I have tried reading records from Cassandra one by one and adding each as a node in Neo4j, but this is very slow. Is there a way, with Neo4j or Titan, to pull the data from Cassandra automatically and build a graph from it?

Upvotes: 1

Views: 355

Answers (2)

FylmTM

Reputation: 1997

Neo4j

TL;DR: there is no ready-to-use tool for your case, but the import tool exists.

So, you want to migrate your data to Neo4j. The fastest way to do this is with the import tool.

Plan:

  • Dump your data from Cassandra to CSV files.
  • Download Neo4j and extract it somewhere.
  • Run the neo4j-import tool (in the bin/ directory), point it at your CSV files, and import them.
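The CSV step of the plan can be sketched as follows. This is only a minimal illustration, not a finished exporter: it assumes the articles have already been read out of Cassandra into (title, linked titles) pairs, and the function name, the `Article` label, and the `LINKS_TO` relationship type are made up for the example. The header fields (`:ID`, `:LABEL`, `:START_ID`, `:END_ID`, `:TYPE`) follow the header format that neo4j-import expects.

```python
import csv


def write_import_csvs(articles, nodes_out, rels_out):
    """Write node and relationship CSVs in the neo4j-import header format.

    articles:  list of (title, [linked titles]) pairs. This shape is an
               assumption; adapt it to however you read rows from Cassandra.
    nodes_out, rels_out: writable text file objects.
    Returns (node_count, relationship_count).
    """
    # Links to titles with no node of their own are dropped here, so every
    # relationship in the output refers to a node that exists.
    known = {title for title, _ in articles}

    nodes = csv.writer(nodes_out)
    rels = csv.writer(rels_out)
    nodes.writerow(["title:ID", ":LABEL"])
    rels.writerow([":START_ID", ":END_ID", ":TYPE"])

    node_count = 0
    rel_count = 0
    for title, links in articles:
        nodes.writerow([title, "Article"])  # "Article" label is hypothetical
        node_count += 1
        for target in links:
            if target in known:
                rels.writerow([title, target, "LINKS_TO"])  # hypothetical type
                rel_count += 1
    return node_count, rel_count


if __name__ == "__main__":
    sample = [("Graph database", ["Neo4j"]), ("Neo4j", ["Graph database"])]
    with open("nodes.csv", "w", newline="") as n, \
            open("rels.csv", "w", newline="") as r:
        print(write_import_csvs(sample, n, r))
```

With the two files in place, the import would be run along the lines of `bin/neo4j-import --into graph.db --nodes nodes.csv --relationships rels.csv`; check the options against the documentation for the Neo4j version you downloaded. The sketch keeps every title in memory to filter links, which may need rethinking for a full Wikipedia dump.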

The import tool is really fast and can handle gigabytes of data.

Upvotes: 1

drobin

Reputation: 276

Titan offers a bulk loading capability, which is recommended for loading large amounts of data:

http://s3.thinkaurelius.com/docs/titan/1.0.0/bulk-loading.html

Here is an older link that may also help, although some of the material is dated:

http://thinkaurelius.com/2014/05/29/powers-of-ten-part-i/

There has to be a "program" of some kind to translate the Wikipedia data into nodes and edges for the property graph. Maybe that is what you mean by "automatically" - asking if such an importer program exists out of the box.
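As a rough idea of what such a translation program does, here is a small sketch that turns one article into a node (its title) and a list of edges (the articles it links to). It assumes the article body is stored as raw wikitext with `[[Target|label]]` style links, which the question does not confirm; the function name and the regular expression are illustrative only.

```python
import re

# Matches [[Target]], [[Target|label]] and [[Target#Section]]; group 1 is
# the target title. Nested or templated links are not handled.
WIKI_LINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")


def extract_links(wikitext):
    """Return the distinct link targets in wikitext, in order of appearance.

    Targets containing ":" are skipped as a crude way of dropping
    namespaced links such as File: or Category:. This also drops the
    few real article titles that contain a colon.
    """
    seen = set()
    targets = []
    for match in WIKI_LINK.finditer(wikitext):
        target = match.group(1).strip()
        if target and ":" not in target and target not in seen:
            seen.add(target)
            targets.append(target)
    return targets


if __name__ == "__main__":
    text = "See [[Graph database|graph DBs]] and [[Neo4j]]. [[File:x.png]]"
    for target in extract_links(text):
        # Each (source title, target) pair becomes one edge in the graph.
        print(("Example article", target))
```

The resulting title and link pairs are what a bulk loader would consume as vertices and edges, whichever database is used.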

I am not aware of a pre-existing program that loads Wikipedia data into Titan, although I am sure there is code somewhere. This link might help with Neo4j:

https://github.com/mirkonasato/graphipedia

Upvotes: 2
