gadget

Reputation: 1978

Neo4j HA replication issue on 1.9.M01

I'm using Neo4j 1.9.M01 in a Spring MVC application that exposes some domain-specific REST services (read, update). The web application is deployed three times into the same web container (Tomcat 6), and each "node" has its own embedded Neo4j HA instance that is part of the same cluster.

The three Neo4j configurations:

#node 1
ha.server_id=1
ha.server=localhost:6361
ha.cluster_server=localhost:5001
ha.initial_hosts=localhost:5001,localhost:5002,localhost:5003

#node 2
ha.server_id=2
ha.server=localhost:6362
ha.cluster_server=localhost:5002
ha.initial_hosts=localhost:5001,localhost:5002,localhost:5003

#node 3
ha.server_id=3
ha.server=localhost:6363
ha.cluster_server=localhost:5003
ha.initial_hosts=localhost:5001,localhost:5002,localhost:5003

Problem: when performing an update on one of the nodes, the change is replicated to only ONE other node. The third node stays in the old state, which breaks the consistency of the cluster.

I'm using the milestone because I'm not allowed to run anything outside of the web container, so I cannot rely on the ZooKeeper-based coordination used in pre-1.9 versions. Am I missing some configuration here, or could this be an issue with the new coordination mechanism introduced in 1.9?

Upvotes: 1

Views: 572

Answers (2)

Mattias Finné

Reputation: 3054

This behaviour (replication only to ONE other instance) is the same default as in 1.8. This is controlled by:

ha.tx_push_factor=1

which is the default.

Slaves get updates from the master in several ways:

  • By configuring a higher push factor, for example:
   ha.tx_push_factor=2
   (set this on every instance, because the value that takes effect is the one on the current master).

  • By configuring a pull interval for slaves to fetch updates from their master, for example:
   ha.pull_interval=1s

  • By manually pulling updates using the Java API

  • By issuing a write transaction from the slave

See further at http://docs.neo4j.org/chunked/milestone/ha-configuration.html

Upvotes: 4

Stefan Armbruster

Reputation: 39915

A first guess would be to set

ha.discovery.enabled = false

See http://docs.neo4j.org/chunked/milestone/ha-configuration.html#_different_methods_for_participating_in_a_cluster for an explanation.

For a full analysis, could you please provide data/graph.db/messages.log from all three cluster members?

Side note: It should also be possible to use 1.8 for your requirements. You could spawn ZooKeeper directly from Tomcat by mimicking what bin/neo4j-coordinator does: run the class org.apache.zookeeper.server.quorum.QuorumPeerMain in a separate thread upon startup of the web application.
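A minimal sketch of that side note, under a few assumptions: EmbeddedZooKeeperLauncher, the thread name and the conf/coord.cfg path are hypothetical names chosen for illustration, and the ZooKeeper jar is assumed to be on the web application's classpath. QuorumPeerMain is loaded reflectively only so that the sketch compiles without ZooKeeper present.

```java
// Hypothetical helper (not part of Neo4j or ZooKeeper): runs the ZooKeeper
// quorum peer on a background thread inside the web application's JVM.
class EmbeddedZooKeeperLauncher {

    // The class named in the answer above.
    static final String PEER_MAIN =
            "org.apache.zookeeper.server.quorum.QuorumPeerMain";

    // configPath is an assumed location of a ZooKeeper configuration file.
    static Thread start(final String configPath) {
        Thread thread = new Thread(new Runnable() {
            @Override
            public void run() {
                try {
                    java.lang.reflect.Method main =
                            Class.forName(PEER_MAIN).getMethod("main", String[].class);
                    // main(String[]) blocks for as long as the peer is running.
                    main.invoke(null, (Object) new String[] { configPath });
                } catch (ClassNotFoundException e) {
                    System.err.println("ZooKeeper is not on the classpath: " + e.getMessage());
                } catch (ReflectiveOperationException e) {
                    System.err.println("ZooKeeper peer could not run: " + e);
                }
            }
        }, "embedded-zookeeper");
        // Daemon thread, so it does not keep the container's JVM alive on shutdown.
        thread.setDaemon(true);
        thread.start();
        return thread;
    }
}
```

A call such as EmbeddedZooKeeperLauncher.start("conf/coord.cfg") could be placed in a ServletContextListener's contextInitialized method, before the Neo4j HA instance is created. It is worth checking how the ZooKeeper version in use handles fatal errors in main, since a System.exit call there would take the whole container down.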

Upvotes: 1
