Reputation: 859

Versioning Mysql Data (Not Just Schema)

There has been talk at my office about creating a package that would version control mysql data (not schemas/migrations).

Basically the process would work like this. Keep in mind the client still uses the backend as normal, image then using it just like a wordpress backend. Client would login select a "branch" give it a name lets say "new users" this would clone a completely new database allowing the user to work on there "branch" without effecting live. Once the client is done making data changes they would merge there data branch into the "master"(live).

Under the hood when merging it would export both live and "new users" branches data to a sql file and do an svn diff and merge the changes.

The situation that arose that enticed the idea was if we have clients that need to make a bunch of changes to there site but dont want to put that data live and while they make changes they dont want to effect other coworkers site changes either. Basically replicated what developers do when working in repositories like Git.

Also if the client works on a dev/demo site the work they do they want to put live.

I wanted to open the discussion to understanding if this is even a good idea? What problems we may run into? Is this a good programming practice when working with data? Does something like this already exist?

Upvotes: 9

Answers (2)

AmirBehzad Eslami

Reputation: 56

Is your requirements as simply as following:

Same back-end code;
Use master data in live;
End-users (or group) to work on isolated data;
No migration, end-users can only modify the data (DML), and are not allowed to modify the schema (DDL);

If these are the requirements, then you can use multiple databases. Consider the following databases in a MySQL server:

masterdb
branch_demo
branch_brian
branch_sandbox
...

These databases share the exact same schema, and only the data is different. In each branch, we have a special table (i.e. dbinfo), to keep track of the parent branch (probably masterdb), create datetime, and other details, like access-level, etc.

id
branchname
parent_branch
created_on
lastmod_on

You can allow end-users to work on separate branches, by simply allowing them to select a particular database in the UI, where masterdb is selected as default, and is used in LIVE.

Creating a new branch, would be as simple as cloning a database;
Merging a newer branch to master, could be handled using the REPLACE statement in MySQL;

If you want to keep track of changes in the data, you can create a special table to log the activities.

Upvotes: 1

VonC

Reputation: 1324228

Database (especially their data) are rarely stored in a version control system because it doesn't scale well for large databases.

In your case, if you have not too much data, that could work, especially since a mysqldump can produce a delimited text format (which has a chance to diff against the previous version)

I would still recommend a separate git repo and a dedicated tool to manage both schema and data changes. For instance, LiquidBase can provide "source control for your database".
You also have, as a dedicated specialized database: off-scale.

If you were to do this manually, then you have good practices summarized in "Recipes for Continuous Database Integration".

As mentioned here, even for schema:

I learned the hard way that applying database schema changes cannot be reliably done without a comprehensive step-by-step plan and, similarly, the order of relationship dependencies are important.
Just storing the "current" or "end" schema is not sufficient. There are many changes that cannot be retroactively applied A->C without knowing A->B->C and some changes B might involve migration logic or corrections.

Upvotes: 3

Versioning Mysql Data (Not Just Schema)

Answers (2)

Related Questions