Reputation: 637
I'm working on a web-app that's very heavily database driven. I'm nearing the initial release and so I've locked down the features for this version, but there are going to be lots of other features implemented after release. These features will inevitably require some modification to the database models, so I'm concerned about the complexity of migrating the database on each release. What I'd like to know is how much should I concern myself with locking down a solid database design now so that I can release quickly, against trying to anticipate certain features now so that I can build it into the database before release? I'm also anticipating finding flaws with my current model and would probably then want to make changes to it, but if I release the app and then data starts coming in, migrating the data would be a difficult task I imagine. Are there conventional methods to tackle this type of problem? A point in the right direction would be very useful.
For a bit of background I'm developing an asset management system for a CG production pipeline. So lots of pieces of data with lots of connections between them. It's web-based, written entirely in Python and it uses SQLAlchemy with a SQLite engine.
Upvotes: 1
Views: 152
Reputation: 7119
One approach that I saw (and liked) was a table called versions
that contained an id only.
Then there was an updates.sql
script that had a structure similar to this:
DELIMITER $$
CREATE PROCEDURE IF NOT EXISTS DBUpdate()
BEGIN
IF (SELECT id FROM versions) = 1 THEN
CREATE TABLE IF NOT EXISTS new_feature_table(
id INT PRIMARY KEY AUTO-INCREMENT,
blah VARCHAR(128) ...,
);
IF (SELECT id FROM versions) = 2 THEN
CREATE TABLE IF NOT EXISTS newer_feature_table(
id INT PRIMARY KEY AUTO-INCREMENT,
blah VARCHAR(128) ...,
);
END$$
DELIMITER ;
CALL PROCEDURE DBUpdate();
Then you write a python script to check the repository for updates, connect to the db, and run any changes to the schema via this procedure. It's nice because you only need a versions
table with the appropriate id value to build out the entire database (with no data, that is; see ryan1234's answer concerning data backups).
Upvotes: 1
Reputation: 7275
Some thoughts for managing databases for a production application:
Make backups nightly. This is crucial because if you try to do an update (to the data or the schema), and you mess up, you'll need to be able to revert to something more stable.
Create environments. You should have something like a local copy of the database for development, a staging database for other people to see and test before going live and of course a production database that your live system points to.
Make sure all three environments are in sync before you start development locally. This way you can track changes over time.
Start writing scripts and version them for releases. Make sure you store these in a source control system (SVN, Git, etc.) You just want a historical record of what has changed and also a small set of scripts that need to be run with a given release. Just helps you stay organized.
Do your changes to your local database and test it. Make sure you have scripts that do two things, 1) Scripts that modify the data, or the schema, 2) Scripts that undo what you've done in case things go wrong. Test these over and over locally. Run the scripts, test and then rollback. Are things still ok?
Run the scripts on staging and see if everything is still ok. Just another chance to prove your work is good and that if needed you can undo your changes.
Once staging is good and you feel confident, run your scripts on the production database. Remember you have scripts to change data (update, delete statements) and scripts to change schema (add fields, rename fields, add tables).
In general take your time and be very deliberate in your actions. The more disciplined you are the more confident you'll be. Updating the database can be scary, so don't rush things, write out your plan of action, and test, test, test!
Upvotes: 2