Todd Leo

Reputation: 45

Is Neo4j capable of storing data in HDFS?

Q1: Is it possible to use HDFS as a storage backend for Neo4j?

My raw data is terabyte-scale (2 TB to 3 TB; it is still being processed, so I can't yet tell exactly how many vertices and edges there are), so naturally I'm concerned whether Neo4j is still suitable in this situation. Our current cluster has a 64-core CPU and 128 GB of RAM per node, but the data can't fit on a local HDD unless the graph can be stored in HDFS.

Q2: Will Neo4j performance benefit from HA Cluster mode?

Does an HA cluster only distribute replicas to each cluster node, or does Neo4j run queries in parallel to gain performance? If the latter, does each node hold a copy of the whole graph (let's assume the entire graph is really big) to reduce network I/O overhead?

Thanks in advance!

BR, Todd Leo

Upvotes: 2

Views: 835

Answers (1)

MicTech

Reputation: 45103

1) It should be possible, but you need to mount HDFS as a regular disk.

But from my point of view it doesn't make sense, because I/O operations will then be very slow compared to an SSD.

2) It increases read performance, because you can use multiple machines (slaves) for read operations. See http://neo4j.com/docs/stable/ha-how.html
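
To illustrate the idea of spreading reads over slaves, here is a minimal Python sketch. It assumes a client-side router sitting in front of the cluster; the `HARouter` class and the host names are hypothetical and not part of Neo4j. In a real deployment a load balancer in front of the cluster would usually play this role.

```python
from itertools import cycle


class HARouter:
    """Toy router: writes go to the master, reads rotate over the slaves.

    Hypothetical helper for illustration only, not a Neo4j API.
    """

    def __init__(self, master, slaves):
        self.master = master
        # If there are no slaves, reads fall back to the master.
        self._readers = cycle(slaves or [master])

    def route(self, is_write):
        # Return the host that should serve this request.
        return self.master if is_write else next(self._readers)


# Hypothetical host names.
router = HARouter(
    "neo4j-master:7474",
    ["neo4j-slave1:7474", "neo4j-slave2:7474"],
)

# Four consecutive reads alternate between the two slaves.
targets = [router.route(is_write=False) for _ in range(4)]
print(targets)
```

Note that this only spreads separate queries over machines; it does not split a single query across them.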

Upvotes: 0
