Jaime
Jaime

Reputation: 2338

How to solve the queue multi-consuming concurrency problem?

Our program is using Queue. Multiple consumers are processing messages.

Consumers do the following:

  1. Receive on or off status message from the Queue.
  2. Get the latest status from the repository.
  3. Compare the state of the repository and the state received from the message.
  4. If the on/off status is different, update the data. (At this time, other related data are also updated.)

Assuming that this process is handled by multiple consumers, the following problems are expected.

  1. Producer sends messages 1: on, 2: off, and 3: on.
  2. Consumer A receives message #1 and stores message #1 in the storage because there is no latest data.
  3. Consumer A receives message #2.
  4. At this time, consumer B receives message #3 at the same time.
  5. Consumers A and B read the latest data from the storage at the same time (message 1).
  6. Consumer B finishes processing first. Don't update the repository as the on/off state is unchanged.(1: on, 3: on)
  7. Then consumer A finishes the processing. The on/off state has changed, so it processes and saves the work. (1: on, 2: off)

In normal case, the latest data remaining in the DB should be on.
(This is because the message was sent in the order of on -> off -> on.)
However, according to the above scenario, off remains the latest data.

Is there any good way to solve this problem? For reference, the queue we use is using AWS Amazon MQ and the storage is using AWS dynamoDB. And using Spring Boot.

Upvotes: 2

Views: 1234

Answers (1)

Justin Bertram
Justin Bertram

Reputation: 35008

The fundamental problem here is that you need to consume these "status" messages in order, but you're using concurrent consumers which leads to race-conditions and out-of-order message processing. In short, your basic architecture using concurrent consumers is causing this problem.

You could possibly work up some kind of solution in the database with timestamps as suggested in the comments, but that would be extra work for the clients and extra data stored in the database that isn't strictly necessary.

The simplest way to solve the problem is to just consume the messages serially rather than concurrently. There are a handful of different ways to do this, e.g.:

  • Define just 1 consumer for the queue with the "status" messages.
  • Use ActiveMQ's "exclusive consumer" feature to ensure that only one consumer receives messages.
  • Use message groups to group all the "status" messages together to ensure they are processed serially (i.e. in order).

Upvotes: 1

Related Questions