M4ks

Reputation: 12024

What are the benefits of using a Data Volume Container as a single-database "backend" in Docker?

I am trying to grasp the idea of a separate Data Volume Container. In many places I found this approach being advocated as beneficial (like in this question), however I don't see any point in using a separate data container for a simple, single-database stack. I know:

  • it decouples my database container - but why would I want that? I won't change the database image, as I'm not developing a database here.
  • it prevents me from accidentally deleting a container - does it really? It's in no way more protected from deletion than my single all-in-one container is
  • it allows me to share the data - but again, I have nobody to share it with, just one container using it
  • I can easily back it up - just like I can back up the data inside my only container, right?

I clearly see the benefits of such an approach in different setups, where the data is to be shared somehow, but as a solution to the "container as a database" problem it seems like just additional clutter to me.

What am I missing?

Upvotes: 3

Views: 734

Answers (2)

Patrick M

Reputation: 10989

I'm still on the sidelines of Docker development, so this is just educated guessing from what I've read/overheard. Responding to your points out of order:

  • it decouples my database container - but why would I want that? I won't change the database image, as I'm not developing a database here.
  • it prevents me from accidentally deleting a container - does it really? It's in no way more protected from deletion than my single all-in-one container is

Say you, or the next guy to maintain this system, come along at some point later and want to upgrade your version of postgres (or whatever your SQL stack is). You decide it would be easier to spin up a new docker container/image for the new version than to layer it on top of the old one. If your single docker container has both data and software on it, you don't have that option. Sure, you can still screw up your data volume, but you can't screw up your data by mucking about with a decoupled server container.
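As a rough sketch of what that decoupling looks like in practice (the container names and postgres image tag here are just illustrative, not from the question):

    # Data-only container: holds the volume, never actually runs anything.
    docker create -v /var/lib/postgresql/data --name dbdata postgres:9.4 /bin/true

    # Server container: borrows the volume from the data container.
    docker run -d --volumes-from dbdata --name db postgres:9.4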

  • it allows me to share the data - but again, I have nobody to share it with, just one container using it

You may not be sharing it now; you may not ever have anyone to share it with. But that doesn't mean you won't want to share it between systems, applications, servers, etc.

  • I can easily back it up - just like I can back up the data inside my only container, right?

Sure, but if you're backing up the whole container, you're backing up the software along with the data each time, which is unnecessary. If you're dumping the data out of your container, then you're not really thinking with containers.
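For example (a sketch with hypothetical names, using the usual throwaway-container backup pattern), you can archive just the data volume, without the database software ever entering the backup:

    # Tar up only the data volume, via a small temporary container.
    docker run --rm --volumes-from dbdata -v $(pwd):/backup busybox \
        tar czf /backup/dbdata.tar.gz /var/lib/postgresql/data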

One clear advantage I see to having a decoupled data volume is for moving data between environments. Say you want a fresh snapshot of data for testing in your stage environment. Just grab a prod container backup and copy it down. If you want to truncate old data or trim down certain tables to make this manageable, it's pretty easy to imagine a build agent starting up a server container connected to this backup and running some scripts or stored procedures (which you might not want to have on your prod containers for security reasons).
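A sketch of that flow, assuming the tar-based backup above and hypothetical names: copy the archive down and unpack it into a fresh data container, and the stage environment is seeded without ever touching prod.

    # On the stage host, after copying dbdata.tar.gz down from prod:
    docker create -v /var/lib/postgresql/data --name stage-dbdata postgres:9.4 /bin/true
    docker run --rm --volumes-from stage-dbdata -v $(pwd):/backup busybox \
        tar xzf /backup/dbdata.tar.gz -C /

    # Trimming scripts can now run against a server attached to this copy.
    docker run -d --volumes-from stage-dbdata --name stage-db postgres:9.4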

You could even have a minimal data container that just has the stub table schemas on it that you use for development and test. You could have a separate all-in-one container for that, but when you need to update your db version, you have to update multiple containers, rather than making one updated/new container and letting it modify the data volumes for you.

Upvotes: 0

larsks

Reputation: 311606

The big benefit you get from decoupling the data from the actual database software itself is that you can trivially update your database software.

With data external to the database container, you can simply build a new image with newer versions of the software, delete the old database container, and start the new one. You don't need to worry about somehow exporting and importing data. The database image itself is completely stateless.
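A minimal sketch of that swap, assuming the data lives in a data-only container named dbdata (names and tags are hypothetical; note that for PostgreSQL specifically, a major-version upgrade still needs pg_upgrade or a dump/restore, since the on-disk format changes between major versions):

    docker stop db && docker rm db        # throw away the stateless server container
    docker pull postgres:9.4.5            # newer image
    docker run -d --volumes-from dbdata --name db postgres:9.4.5   # same data, new software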

Another benefit to keeping your data external to the container is that if your storage needs for the database grow large, you can fairly easily move to using a host volume instead of a data-only container without needing to reconfigure storage for all of your containers.
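For instance (hypothetical paths and names): copy the volume's contents out to a host directory once, then point the same image at it with a bind mount.

    # One-off copy from the data-only container to a host directory.
    docker run --rm --volumes-from dbdata -v /srv/pgdata:/target busybox \
        cp -a /var/lib/postgresql/data/. /target/

    # From now on, use a host volume instead of the data container.
    docker stop db && docker rm db
    docker run -d -v /srv/pgdata:/var/lib/postgresql/data --name db postgres:9.4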

In contrast, if you are storing your data in your database container, your upgrade path is going to be one of:

  • Treat the container like a VM.

    Log into the container, perform some sort of package upgrade, and restart the database service. This works, but is less maintainable because your image is no longer generated directly from a Dockerfile: because you have made manual changes, there is no longer a clear, automated process to rebuild the image to the same state.

  • Copy your data into a new container.

    This is really just extra work. The one benefit to this model is that it provides you with a mechanism by which you can roll back both to an earlier version of the database software and an earlier version of the database content.

Upvotes: 3
