Milad Soghrati

Reputation: 190

Different results between neo4j .net client and web client

We've been running a Neo4j High Availability cluster on Azure for some months, and we are facing an issue with the .NET client.

When we connect to the cluster using the web client, everything is fine: we can query the nodes and the results appear. But when we run the same query through the .NET client, some nodes are not found.

We tried counting the nodes from the web client and the result is 850 while the .net client returns 620.

I restarted one of the three VMs in the cluster and the problem went away. We cannot figure out what was wrong, and we really don't want our service to be unreliable!

Where do you think the problem is?

///Update 1

We are facing problems with a variety of queries, but the simplest one, which returns the node count, is:

match (t)
return count(t)

and the C# equivalent we are using is:

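// Build the count query, execute it, and read the single value (Single() needs System.Linq)
var count = client.Cypher
  .Match("(t)")
  .Return<int>("count (t)")
  .Results
  .Single();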

The Cypher query returned 850 and the C# code returned 620 nodes. After restarting one of the machines, both returned 850. But after a while, once we had added some more nodes, the web client (Cypher query) reported 857 while the C# client reported 856. Meanwhile, we were having trouble adding new nodes with the C# client, while the web client was working well. A VM restart fixed the issue again!

The .NET client is Neo4jClient.

///Update 2

We tested Neo4jDotNetDriver over Bolt to get the node count, and it works fine. We then deleted all nodes and fetched the node count with both clients: Neo4jClient still returns 857, while Neo4jDotNetDriver over Bolt correctly returns 0.

We also tried creating some nodes with Neo4jClient; many of them are not created at all, while some are!

Upvotes: 0

Views: 147

Answers (1)

Milad Soghrati

Reputation: 190

OK, after a lot of testing we finally figured out where the problem was. There is nothing wrong with the client. As I mentioned earlier, we are using a Neo4j High Availability cluster on Azure with 3 VMs. We tried calling the REST API directly from three different PCs, and we got wrong responses on one of them and correct responses on the other two. This is where we began to suspect that our requests were being answered by different machines in the cluster.

So we queried each machine separately using the Neo4j Cypher Shell and found that one of the machines was out of sync with the other two, and that was the one returning the wrong results. After some research we found out that this is a bug in the Neo4j configuration, which disables (comments out) ha.pull_interval, so machines go out of sync after a while. We set this interval to 10, restarted the machines, and everything is fine now!
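For anyone applying the same fix, the setting would look roughly like this in neo4j.conf on each instance. This is only a sketch: the file name and the reading of the value as seconds are assumptions based on Neo4j 3.x HA defaults, so check them against the documentation for your version.

```
# neo4j.conf, on every instance in the HA cluster (file name assumed)
# How often a slave pulls updates from the master.
# Reported above as commented out by default, which leaves pulling disabled.
ha.pull_interval=10
```

Each instance needs a restart to pick up the change, as described above.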

You can find more about this issue here, and here

I just don't know why API calls from one client were always being routed to the out-of-sync machine. It would be great if someone could explain why.

Upvotes: 0
