How to deal with shared state in a micro-service architecture?

In our company we are transitioning from a huge monolithic application to a micro-service architecture. The main technical drivers for this decision were the need to scale services independently and the need to scale development: we've got ten scrum teams working on different projects (or 'micro-services').

The transition is going smoothly and we've already started to benefit from the new technical and organizational structure. However, there is one main pain point that we are struggling with: how to manage the 'state' of the dependencies between these micro-services.

Let's take an example: one of the micro-services deals with users and registrations. This service (let's call it X) is responsible for maintaining identity information and is therefore the main provider of user 'ids'. The rest of the micro-services have a strong dependency on this one. For example, there are services responsible for user profile information (A), user permissions (B), user groups (C), etc. that rely on those user ids, so some data needs to be kept in sync between these services (i.e. service A should not have info for a userId that is not registered in service X). We currently maintain this sync by publishing notifications of state changes (new registrations, for example) over RabbitMQ.
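To make the invariant concrete, here is a minimal sketch of the consuming side in service A. It is only an illustration under assumptions: the `ProfileService` class and the event names `user_registered` and `user_deleted` are hypothetical, and the RabbitMQ consumer is replaced by direct calls to `handle_event` so the example stays self-contained.

```python
from dataclasses import dataclass, field


@dataclass
class ProfileService:
    """Hypothetical stand-in for service A; keeps a local copy of known user ids."""

    known_user_ids: set = field(default_factory=set)
    profiles: dict = field(default_factory=dict)

    def handle_event(self, event: dict) -> None:
        # In the real system this would be called by the message consumer
        # for each state change published by service X.
        if event["type"] == "user_registered":
            self.known_user_ids.add(event["user_id"])
        elif event["type"] == "user_deleted":
            self.known_user_ids.discard(event["user_id"])
            self.profiles.pop(event["user_id"], None)

    def save_profile(self, user_id: int, name: str) -> None:
        # Enforce the invariant: no profile for an id that X has not registered.
        if user_id not in self.known_user_ids:
            raise KeyError(f"unknown user id {user_id}")
        self.profiles[user_id] = name


service_a = ProfileService()
service_a.handle_event({"type": "user_registered", "user_id": 1})
service_a.save_profile(1, "Alice")
```

The point of the sketch is that A's view of X's state is just the replay of X's events, which is exactly the state that has to be reproduced in every test environment.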

As you can imagine, there are many Xs: many 'main' services and many more complicated dependencies between them.

The main issue comes when managing the different dev/testing environments. Every team (and thus, every service) needs to go through several environments in order to put some code live: continuous integration, team integration, acceptance test and live environments.

Obviously we need all services running in all these environments to check that the system works as a whole. This means that in order to test dependent services (A, B, C, ...) we rely not only on service X, but also on its state. Thus, we somehow need to maintain system integrity and store a global, coherent state.

Our current approach is to take snapshots of all DBs from the live environment, apply some transformations to shrink the data and protect privacy, and propagate the result to all environments before testing in a particular environment. This is obviously a tremendous overhead, both organizationally and in computational resources: we have ten continuous integration environments, ten integration environments and one acceptance test environment, and all of them frequently need to be 'refreshed' with this shared data from live and with the latest version of the code.
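As a rough illustration of the kind of transformation meant here, the following sketch shrinks and pseudonymizes a user table. It is not our actual pipeline: the column names, the `keep_one_in` sampling rule, the salt value and the `example.invalid` domain are all assumptions made for the example.

```python
from __future__ import annotations

import hashlib


def keep_user(user_id: int, keep_one_in: int = 10) -> bool:
    # Deterministic sampling on the user id: every service that applies the
    # same rule keeps the same users, so the shrunken snapshots stay coherent.
    return user_id % keep_one_in == 0


def pseudonymize(value: str, salt: str = "per-refresh-secret") -> str:
    # Salted one-way hash, truncated for readability. This is pseudonymization,
    # not full anonymization; the salt here is a placeholder.
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:12]


def transform_users(rows: list[dict]) -> list[dict]:
    # Keep the user id untouched so references from other services still match.
    return [
        {
            "user_id": row["user_id"],
            "email": pseudonymize(row["email"]) + "@example.invalid",
        }
        for row in rows
        if keep_user(row["user_id"])
    ]


live_rows = [{"user_id": i, "email": f"user{i}@corp.example"} for i in range(1, 31)]
snapshot = transform_users(live_rows)
```

Sampling on the shared key (the user id) rather than at random is what keeps the per-service snapshots consistent with each other.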

We are struggling to find a better way to ease this pain. Currently we are evaluating two options:

using docker-like containers for all these services
having two versions of each service (one intended for development of that service and another serving as a sandbox to be used by the rest of the teams in their development & integration testing)

Neither of these solutions eases the pain of shared data between services. We'd like to know how other companies/developers are addressing this problem, as we think it must be common in a micro-service architecture.

How are you guys doing it? Do you also have this problem? Any recommendation?

Sorry for the long explanation and thanks a lot!

Upvotes: 20

Answers (3)

neleus

Reputation: 2280

This time I've read your question from a different perspective, so here is a 'different opinion'. I know it may be too late, but I hope it helps with further development.

It looks like the shared state is the result of incorrect decoupling. In a 'right' microservice architecture, microservices have to be isolated functionally rather than logically. I mean that user profile information (A), user permissions (B) and user groups (C) all look functionally the same and more or less functionally coherent. Together they seem to be a single user service with one coherent storage, although that may not look like a micro-service. I don't see any reason here for decoupling them (or at least you haven't mentioned one).

Starting from this point, splitting it into smaller independently deployable units may cause more cost and trouble than benefit. There should be a significant reason for doing so (sometimes it is political, sometimes just a lack of product knowledge).

So the real problem is related to microservice isolation. Ideally, each microservice can live as a complete standalone product and deliver well-defined business value. When elaborating a system architecture, we break it up into tiny logical units (A, B, C, etc. in your case, or even smaller) and then define functionally coherent subgroups. I can't give you exact rules for how to do that, only some indicators: if there is complex communication or many dependencies between the units, or many common terms in their ubiquitous languages, then such units probably belong to the same functional group and thus to a single service.

So, in your example, since there is effectively a single storage, the only way to manage its consistency is the one you already use.

BTW, I wonder how you actually solved your problem?

Upvotes: 11

Manu

Reputation: 51

Let me try to reformulate the problem:

Actors:

X: UserIds (state of account)
- provide service to get ID (based on credentials) and status of account
A: UserProfile
- Using X to check status of a user account. Stores name along with link to account
- provide service to get/edit name based on ID
B: UserBlogs
- Using X in same way. Stores blog post along with link to account when user writes one
- Using A to search blog post based on user name
- provide service get/edit list of blog entries based on ID
- provide service to search for blog post based on name (relies on A)
C: MobileApp
- wraps features of X, A, B into a mobile app
- provide all services above, relying on well-defined communication contract with all others (following @neleus statement)

Requirements:

1) Work of teams X, A, B, C needs to be decoupled
2) Integration environments for X, A, B, C need to be updated with the latest features (in order to perform integration tests)
3) Integration environments for X, A, B, C need to have a 'sufficient' set of data (in order to perform load tests, and to find edge cases)

Following @Eugene's idea: having mocks for each service, provided by every team, would allow 1) and 2)

cost is more development from the teams
also maintenance of the mocks as well as the main feature
impediment is the fact that you have a monolithic system (you do not have a set of clean well defined/isolated services yet)
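A minimal sketch of what such a mock could look like, assuming team X publishes a small contract together with an in-memory implementation of it. The names `UserIdService`, `MockUserIdService` and `can_create_profile` are hypothetical and only serve the illustration.

```python
from typing import Protocol


class UserIdService(Protocol):
    """Hypothetical contract that team X publishes for its consumers."""

    def exists(self, user_id: int) -> bool: ...


class MockUserIdService:
    """In-memory mock that team X would ship alongside the real service."""

    def __init__(self, user_ids):
        self._user_ids = set(user_ids)

    def exists(self, user_id: int) -> bool:
        return user_id in self._user_ids


def can_create_profile(x: UserIdService, user_id: int) -> bool:
    # Code owned by team A depends only on the contract, not on X's database,
    # so it can be tested without a running instance of X.
    return x.exists(user_id)


mock_x = MockUserIdService(user_ids=[1, 2, 3])
```

With this, team A can run `can_create_profile(mock_x, 2)` in its own environment; the maintenance cost mentioned above is keeping the mock's behaviour in line with the real service.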

Suggested solution:

What about having a shared environment with a set of master data to address 3)? Every 'delivered' service (i.e. running in production) would be available there. Each team could choose which services to use from this shared environment and which ones to use from its own environment.

One immediate drawback I can see is the shared states and consistency of data.

Let's consider automated tests run against the master data, e.g.:

B changes names (owned by A) in order to work on its blog service
- might break A, or C
A changes the status of an account in order to work on some permission scenarios
- might break X, B
C changes all of it on same accounts
- breaks all others

The master set of data would quickly become inconsistent and lose its value for requirement 3) above.

We could therefore add a 'conventional' layer on top of the shared master data: anyone can read from the full set, but can only modify the objects that they have created?
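Such a convention could be sketched as follows. This is only an illustration of the rule, assuming a simple key/value store; the class `SharedMasterData`, the team names and the key format are hypothetical.

```python
class SharedMasterData:
    """Hypothetical shared store: readable by everyone, writable only by the creator."""

    def __init__(self):
        self._objects = {}  # key -> (owner, value)

    def create(self, team: str, key: str, value) -> None:
        if key in self._objects:
            raise KeyError(f"{key} already exists")
        self._objects[key] = (team, value)

    def read(self, key: str):
        # Any team may read any object.
        return self._objects[key][1]

    def update(self, team: str, key: str, value) -> None:
        owner, _ = self._objects[key]
        # Only the team that created the object may change it.
        if owner != team:
            raise PermissionError(f"{team} does not own {key}")
        self._objects[key] = (owner, value)


store = SharedMasterData()
store.create("team_a", "user:1:name", "Alice")
```

Here `store.update("team_b", "user:1:name", ...)` raises `PermissionError`, so team B's tests would have to create their own objects instead of changing team A's.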

Upvotes: 1

Eugene

Reputation: 1895

From my perspective, only the objects that use the services should hold the state. Let's consider your example: service X is responsible for the user ID, service A for the profile information, etc. Let's assume a user Y, who has some security token (created, for example, from their user name and password; it should be unique), enters the system. The client, which holds the user information, sends the security token to service X. Service X stores the user ID linked to that token. For a new user, service X creates a new ID and stores the token. Service X then returns the ID to the user object. The user object asks service A for the user profile by providing the user ID. Service A takes the ID and asks service X whether that ID exists. If service X answers positively, service A can look up the profile information by user ID, or ask the user to provide that information in order to create it. The same logic should work for services B and C. The services have to talk to each other, but they don't need to know about the user state.
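The flow above can be sketched roughly like this. It is a simplified in-process illustration, not a real implementation: the class and method names (`ServiceX.resolve`, `ServiceX.exists`, `ServiceA.get_profile`, `ServiceA.set_profile`) are made up, and the network calls between the services are replaced by plain method calls.

```python
import itertools


class ServiceX:
    """Hypothetical identity service: maps security tokens to user IDs."""

    def __init__(self):
        self._ids_by_token = {}
        self._next_id = itertools.count(1)

    def resolve(self, token: str) -> int:
        # For a new user, create a new ID and store the token.
        if token not in self._ids_by_token:
            self._ids_by_token[token] = next(self._next_id)
        return self._ids_by_token[token]

    def exists(self, user_id: int) -> bool:
        return user_id in self._ids_by_token.values()


class ServiceA:
    """Hypothetical profile service: asks X whether an ID exists before answering."""

    def __init__(self, x: ServiceX):
        self._x = x
        self._profiles = {}

    def get_profile(self, user_id: int):
        if not self._x.exists(user_id):
            raise LookupError(f"user id {user_id} is not known to service X")
        # None means no profile yet: the client should ask the user to provide one.
        return self._profiles.get(user_id)

    def set_profile(self, user_id: int, profile: dict) -> None:
        if not self._x.exists(user_id):
            raise LookupError(f"user id {user_id} is not known to service X")
        self._profiles[user_id] = profile


x = ServiceX()
a = ServiceA(x)
user_id = x.resolve("token-of-user-y")  # the client sends its token to X
a.set_profile(user_id, {"name": "Y"})
```

Note that service A stores nothing about whether the user is registered; it asks X each time, which is the trade-off of this approach compared with keeping a synchronized copy.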

A few words about the environments: I would suggest using Puppet. It is a way to automate the service deployment process. We use Puppet to deploy our services to the different environments. Puppet scripts are rich and allow flexible configuration.

Upvotes: 0
