narendra-choudhary

Reputation: 4826

What happens when a Eureka instance skips a heartbeat against a Eureka server with self-preservation turned off?

Consider this set-up:

Suppose one of the instances (say srv#1inst#1, an instance of service#1) sent a heartbeat, but it did not reach the Eureka server.

AFAIK, the following actions take place in sequence on the server side:

Now on the instance (srv#1inst#1) side:

AFAIK, the eviction and registration do not happen immediately. The Eureka server periodically runs a separate scheduler for each of these tasks.

I have some questions related to this process:

Upvotes: 4

Views: 607

Answers (1)

narendra-choudhary

Reputation: 4826

This question was answered by qiangdavidliu in one of the issues in Eureka's GitHub repository.

I'm adding his explanations here for the sake of completeness.


Before I answer the questions specifically, here's some high-level information regarding heartbeats and evictions (based on default configs):

  1. Instances are only evicted if they miss 3 consecutive heartbeats.
  2. (Most) heartbeats are not retried; they are sent on a best-effort basis every 30s. A heartbeat is only retried if there is a thread-level error on the heartbeating thread (i.e. Timeout or RejectedExecution), but this should be very rare.
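
To make the two defaults above concrete, here is a minimal sketch of how "3 missed heartbeats at 30s each" translates into a lease check. `LeaseSketch` and all of its members are hypothetical names for illustration, not Eureka's actual classes, and the 90s lease duration is simply derived from the two numbers stated above.

```java
/** Hypothetical model of a single instance lease; not Eureka's real implementation. */
final class LeaseSketch {
    // Assumed defaults taken from the text: a heartbeat every 30s, 3 misses tolerated.
    static final long RENEWAL_INTERVAL_MS = 30_000L;
    static final int MISSED_HEARTBEATS_BEFORE_EVICTION = 3;
    static final long LEASE_DURATION_MS = RENEWAL_INTERVAL_MS * MISSED_HEARTBEATS_BEFORE_EVICTION;

    private long lastRenewalMs;

    LeaseSketch(long registeredAtMs) {
        this.lastRenewalMs = registeredAtMs;
    }

    /** Called when a heartbeat actually reaches the server. */
    void renew(long nowMs) {
        lastRenewalMs = nowMs;
    }

    /** True once more than the full lease duration has passed without a renewal. */
    boolean isExpired(long nowMs) {
        return nowMs - lastRenewalMs > LEASE_DURATION_MS;
    }

    public static void main(String[] args) {
        LeaseSketch lease = new LeaseSketch(0L);
        // One lost heartbeat: 60s since the last successful renewal.
        System.out.println("expired after 60s: " + lease.isExpired(60_000L));
        // Three lost heartbeats: just past 90s.
        System.out.println("expired after 91s: " + lease.isExpired(91_000L));
    }
}
```

In this model a single lost heartbeat does nothing on its own: the next successful heartbeat resets the clock before the lease runs out.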

Let me try to answer your questions:

Are the sequences correct? If not, what did I miss?

A: The sequences are correct, with the above clarifications.

Is the assumption about eviction and registration scheduler correct?

A: The eviction is handled by an internal scheduler. The registration is processed by the handler thread for the registration request.

An instance of service#2 requests fresh registry copy from server right after ServerStep2.

  • Will srv#1inst#1 be in the fresh registry copy, because it has not been evicted yet?
    • If yes, will srv#1inst#1 be marked UP or DOWN?

A: There are a few things here:

  1. until the instance is actually evicted, it will be part of the result
  2. eviction does not involve changing the instance's status, it merely removes the instance from the registry
  3. the server holds 30s caches of the state of the world, and it is this cache that's returned. So the exact result as seen by the client, in an eviction scenario, still depends on where the request falls within the cache's update cycle.
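
The effect of the cache in point 3 can be illustrated with a small sketch. `CachedRegistrySketch` is a hypothetical model, not Eureka's response cache; in a real server a timer would trigger the refresh, whereas here `refreshCache()` is called by hand so the ordering is easy to follow.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical registry whose reads are served from a periodically refreshed snapshot. */
final class CachedRegistrySketch {
    private final Set<String> liveInstances = new LinkedHashSet<>();
    private List<String> cachedSnapshot = Collections.emptyList();

    void register(String instanceId) {
        liveInstances.add(instanceId);
    }

    /** Eviction only removes the entry; it does not change any status. */
    void evict(String instanceId) {
        liveInstances.remove(instanceId);
    }

    /** Assumed to be invoked by a timer (every 30s in the scenario described above). */
    void refreshCache() {
        cachedSnapshot = Collections.unmodifiableList(new ArrayList<>(liveInstances));
    }

    /** What a client such as an instance of service#2 would receive. */
    List<String> fetchRegistry() {
        return cachedSnapshot;
    }

    public static void main(String[] args) {
        CachedRegistrySketch registry = new CachedRegistrySketch();
        registry.register("srv#1inst#1");
        registry.refreshCache();
        registry.evict("srv#1inst#1");
        System.out.println("before refresh: " + registry.fetchRegistry());
        registry.refreshCache();
        System.out.println("after refresh:  " + registry.fetchRegistry());
    }
}
```

Under these assumptions a client can still receive an already evicted instance until the next cache refresh, which is why the answer depends on timing.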

The retry request from InstanceStep2 of srv#1inst#1 reaches server right after ServerStep2.

  • Will there be an immediate change in registry?
  • How will that affect the response to the request from the instance of service#2 for a fresh registry? How will it affect the eviction scheduler?

A: Again, a few things:

  1. When the actual eviction happens, we check each evictee's time to see if it is eligible to be evicted. If an instance is able to renew its heartbeats before this event, then it is no longer a target for eviction.
  2. The 3 events in question (evaluation of eviction eligibility at eviction time, updating the heartbeat status of an instance, generation of the result to be returned to the read operations) all happen asynchronously, and their results depend on the evaluation of the criteria described above at execution time.
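
As a rough illustration of point 1, the sketch below evaluates eligibility only when the eviction pass runs, so a renewal that arrives first takes the instance off the eviction list. `EvictionPassSketch`, its method names, and the 90s lease duration are assumptions made for this example, not Eureka's API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry where eviction eligibility is checked at eviction time. */
final class EvictionPassSketch {
    // Assumed lease duration: 3 missed heartbeats at 30s each.
    static final long LEASE_DURATION_MS = 90_000L;

    private final Map<String, Long> lastRenewalMs = new ConcurrentHashMap<>();

    /** Handles both the initial registration and later heartbeats in this model. */
    void renew(String instanceId, long nowMs) {
        lastRenewalMs.put(instanceId, nowMs);
    }

    boolean contains(String instanceId) {
        return lastRenewalMs.containsKey(instanceId);
    }

    /** One eviction pass: removes every instance whose lease has expired as of nowMs. */
    List<String> evictExpired(long nowMs) {
        List<String> evicted = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastRenewalMs.entrySet()) {
            boolean expired = nowMs - entry.getValue() > LEASE_DURATION_MS;
            // remove(key, value) only succeeds if no renewal replaced the timestamp meanwhile.
            if (expired && lastRenewalMs.remove(entry.getKey(), entry.getValue())) {
                evicted.add(entry.getKey());
            }
        }
        return evicted;
    }

    public static void main(String[] args) {
        EvictionPassSketch registry = new EvictionPassSketch();
        registry.renew("srv#1inst#1", 0L);
        registry.renew("srv#1inst#2", 0L);
        // srv#1inst#1 gets a heartbeat through just before the eviction pass runs.
        registry.renew("srv#1inst#1", 95_000L);
        System.out.println("evicted: " + registry.evictExpired(100_000L));
    }
}
```

The conditional `remove(key, value)` is one way to keep a concurrent renewal from being lost; whether the real server guards this the same way is not something the answer states.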

Upvotes: 1
