Alex Avrutin

Reputation: 1391

Apache Ignite unable to find a deployed service

I've noticed strange behaviour in Apache Ignite that occurs fairly reliably on my 5-node cluster but can be reproduced even with a two-node cluster. I use Apache Ignite 2.7 for .NET on Linux, deployed in a Kubernetes cluster (each pod hosts one node).

The problem is as follows. Assume we have a cluster consisting of two Apache Ignite nodes, A and B. Both nodes start and initialize. A couple of Ignite services are deployed on each node during the initialization phase. Among others, a service named QuoteService is deployed on node B.

So far so good. The cluster works as expected. Then node B crashes or gets stopped for whatever reason and restarts. All the Ignite services hosted on node B get redeployed, and the node rejoins the cluster.

However, when a service on node A tries to call the QuoteService that is expected to be available on node B, an exception is thrown with the following message: Failed to find deployed service: QuoteService. This is strange, as the line registering the service did run during the restart of node B:

services.DeployMultiple("QuoteGenerator", new Services.Ignite.QuoteGenerator(), 8, 2);

(Deploying the service as a singleton does not make any difference.)

Restarting either node A or node B on its own does not help. The problem can only be resolved by shutting down the entire Ignite cluster and restarting all the nodes.

This condition can be reproduced even when 5 nodes are running.

This bug report may look a bit unspecific, but it is hard to give concrete reproduction steps, as reproducing it involves setting up at least two Ignite nodes and stopping and restarting them in sequence. So let me pose the questions this way:

1. Have you ever noticed such a condition, or have you received similar reports from other users?
2. If so, what steps can you recommend to address this problem?
3. Should I wait for the next version of Apache Ignite, as I read that the service deployment mechanism is currently being overhauled?

UPD: I am getting a similar problem on a running cluster even if I don't stop/start nodes. I will open another question on SO, as it seems to have a different cause.

Upvotes: 1

Views: 759

Answers (1)

Alex Avrutin

Reputation: 1391

I've figured out what caused the described behavior (although I don't understand exactly why).

I wanted to ensure that the Ignite service is only deployed on the current node, so I used the following C# code to deploy the service:

var services = ignite.GetCluster().ForLocal().GetServices();
services.DeployMultiple("FlatFileService", new Services.Ignite.FlatFileService(), 8, 2);

When I changed my code to rely only on a NodeFilter to limit the deployment of the service to a specific set of nodes, and got rid of "GetCluster().ForLocal().", the bug disappeared. The final code is as follows:

var flatFileServiceCfg = new ServiceConfiguration
{
    Service = new Services.Ignite.FlatFileService(),
    Name = "FlatFileService",
    NodeFilter = new ProductServiceNodeFilter(),
    MaxPerNodeCount = 2,
    TotalCount = 8
};
var services = ignite.GetServices();
services.DeployAll(new[] { flatFileServiceCfg, ... other services... });

It is still strange, however, that the old code worked until the topology changed.
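
For reference, the ProductServiceNodeFilter used in the configuration above is not shown here. A minimal sketch of what such a filter could look like follows, assuming each node is tagged with a user attribute at startup. The attribute name "service.role" and the value "product" are hypothetical and chosen only for illustration; the real filter may select nodes in a different way.

```csharp
using System;
using Apache.Ignite.Core.Cluster;

// Hypothetical sketch: the actual ProductServiceNodeFilter is not shown in this post.
// The filter is marked [Serializable] because it is sent to other nodes in the cluster.
[Serializable]
public class ProductServiceNodeFilter : IClusterNodeFilter
{
    // Assumed attribute name, expected to be set per node
    // through IgniteConfiguration.UserAttributes.
    private const string RoleAttribute = "service.role";

    // Returns true for nodes that are allowed to host the service.
    public bool Invoke(IClusterNode node)
    {
        string role;
        return node.TryGetAttribute(RoleAttribute, out role) && role == "product";
    }
}
```

With a filter like this, a node that should host the service would be started with a matching entry in IgniteConfiguration.UserAttributes (for example, "service.role" mapped to "product"), so the selection is based on a node property rather than on a local cluster group.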

Upvotes: 0
