Babyburger

Reputation: 1830

Blue green deployment with database on Kubernetes - loss of data?

I am reading about blue green deployment with database changes on Kubernetes. It explains very clearly and in detail how the process works:

  1. deploy new containers with the new versions while still directing traffic to the old containers
  2. migrate database changes and have the services point to the new database
  3. redirect traffic to the new containers and remove the old containers when there are no issues

I have some questions particularly about the moment we switch from the old database to the new one.

In step 3 of the article, we have person-v1 and person-v2 services that both still point to the unmodified version of the database (postgres v1):

[Figure: before database migration]

In this picture, person-v2 presumably points at the database only so that it can establish a TCP connection; actual queries would likely fail because its code is incompatible with the v1 schema. Since all incoming traffic is still directed to person-v1, this is not a problem.

We now modify the database (to postgres v2) and switch the traffic to person-v2 (step 4 in the article). I assume that the DB migration and the traffic switch happen at the same time. Does that mean it is impossible for person-v1 to communicate with postgres v2, or for person-v2 to communicate with postgres v1, at any point during this transition? Either combination could obviously cause errors (e.g. inserting data into a column that doesn't exist yet or no longer exists).

[Figure: after database migration]

If the above assumption is correct, what happens if new data is inserted into postgres v1 during the DB migration? Could data be lost with unlucky timing? Switching the traffic at the same moment as the DB does not stop requests that are already in flight in person-v1 from executing DB statements. It seems to me that any new inserts, deletes, or updates would need to propagate to postgres v2 as well for as long as the migration is in progress.

Upvotes: 1

Views: 730

Answers (2)

Jonas

Reputation: 128807

"I am reading about blue green deployment with database changes on Kubernetes. It explains very clearly and in detail how the process works"

It's an interesting article, but I would not do the database migration as described there. Blue-green deployment does not make this much easier: you cannot atomically swap the traffic, since replicas of the old version may still be processing requests, and you don't want to cut off ongoing requests.

The DB change must be done in a way that does not break the first version of the code. This may have to be done in multiple steps.

Considering the same example, there are several possible solutions. For example, first add a view with the new column names, then deploy a version of the code that uses the view, then change the column names, and finally deploy a newer version of the code that uses the new column names. Alternatively, you can add columns with the new names alongside the old ones, let the old version of the code keep using the old column names while the new version uses the new ones, and finally remove the old columns once no replica of the old code is running.
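The view-based sequence could look roughly like the sketch below. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for Postgres so that it runs anywhere. The table `person`, the columns `name` and `full_name`, and the view `person_view` are hypothetical names chosen for illustration and are not taken from the article. `RENAME COLUMN` needs SQLite 3.25 or newer.

```python
import sqlite3


def add_view(conn):
    # Step 1: expose the old column under its future name.
    # Code that reads person.name directly is unaffected.
    conn.execute(
        "CREATE VIEW person_view AS SELECT id, name AS full_name FROM person"
    )


def read_via_view(conn):
    # Step 2: an intermediate release reads only through the view,
    # so it no longer depends on the physical column name.
    rows = conn.execute("SELECT full_name FROM person_view ORDER BY id")
    return [row[0] for row in rows]


def rename_column(conn):
    # Step 3: once no running replica reads person.name directly,
    # rename the column and redefine the view as a plain pass-through.
    conn.execute("DROP VIEW person_view")
    conn.execute("ALTER TABLE person RENAME COLUMN name TO full_name")
    conn.execute("CREATE VIEW person_view AS SELECT id, full_name FROM person")


def read_direct(conn):
    # Step 4: the final release reads the renamed column from the table.
    rows = conn.execute("SELECT full_name FROM person ORDER BY id")
    return [row[0] for row in rows]


def demo_view():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO person (name) VALUES ('Ada')")
    add_view(conn)
    before = read_via_view(conn)
    rename_column(conn)
    after = read_via_view(conn)
    return before, after, read_direct(conn)
```

The sketch only covers reads. SQLite views are read-only, so writes through the view would need extra work there; in Postgres, simple single-table views are automatically updatable, which is one reason this pattern tends to be more practical on Postgres.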

With the approach described above, either rolling upgrades or blue-green deployments can be used.

Upvotes: 1

coderanger

Reputation: 54191

Even when doing blue-green for the application servers, you still have to follow the normal rules of DB schema compatibility. All schema changes need to be backwards compatible for whatever you consider one full release cycle to be. Both services talk to the same DB during the switchover, but thanks to careful planning each can understand the data written by the other, and all is well.
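As a minimal sketch of what such a backwards-compatible change can look like, the example below adds a new column alongside the old one and lets both code versions write to the same database. It uses Python's built-in sqlite3 module as a stand-in for Postgres; the table `person`, the columns `name` and `full_name`, and the `insert_v1`/`insert_v2` helpers are hypothetical and only illustrate the idea.

```python
import sqlite3


def expand(conn):
    # Additive migration: the old column stays, so v1 code keeps working.
    conn.execute("ALTER TABLE person ADD COLUMN full_name TEXT")
    # Backfill existing rows so the new column is populated.
    conn.execute("UPDATE person SET full_name = name WHERE full_name IS NULL")


def insert_v1(conn, name):
    # Old code: only knows the old column.
    conn.execute("INSERT INTO person (name) VALUES (?)", (name,))


def insert_v2(conn, name):
    # New code: writes both columns while v1 replicas may still be running.
    conn.execute(
        "INSERT INTO person (name, full_name) VALUES (?, ?)", (name, name)
    )


def read_v1(conn):
    rows = conn.execute("SELECT name FROM person ORDER BY id")
    return [row[0] for row in rows]


def read_v2(conn):
    # New code prefers the new column and falls back to the old one
    # for rows written by v1 after the backfill.
    rows = conn.execute(
        "SELECT COALESCE(full_name, name) FROM person ORDER BY id"
    )
    return [row[0] for row in rows]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
    insert_v1(conn, "Ada")    # written before the migration
    expand(conn)
    insert_v1(conn, "Grace")  # in-flight v1 request after the migration
    insert_v2(conn, "Linus")  # first request served by v2
    return read_v1(conn), read_v2(conn)
```

Because there is only one database and neither version ever sees a schema it cannot handle, a late write from the old version is not lost. Dropping the old column would be a separate, later step, done only after a final backfill and once no v1 replica is running.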

Upvotes: 1
