Reputation: 30391
I'm currently doing web development with another developer on a centralized development server. In the past this has worked alright, as we have two separate projects we are working on and rarely conflict. Now, however, we are adding a third (possible) developer into the mix. This is clearly going to create problems with other developers changes affecting my work and vice versa. To solve this problem, I'm thinking the best solution would be to create a virtual machine to distribute between the developers for local use. The problem I have is when it comes to the database.
Given that we all develop on laptops, simply keeping a local copy of the live data is plain stupid.
I've considered sanitizing the data, but I can't really figure out how to replace the real data, with data that would be representative of what people actually enter with out repeating the same information over and over again, e.g. everyone's address becomes 123 Testing Lane, Test Town, WA, 99999 or something. Is this really something to be concerned about? Are there tools to help with this sort of thing? I'm using MySQL. Ideally, if I sanitized the db it should be done from a script that I can run regularly. If I do this I'd also need a way to reduce the size of the db itself. (I figure I could select all the records created after x and whack them and all the records in corresponding tables out so that isn't really a big deal.)
The second solution I've thought of is to encrypt the hard drive of the vm, but I'm unsure of how practical this is in terms of speed and also in the event of a lost/stolen laptop. If I do this, should the vm hard drive file itself be encrypted or should it be encrypted in the vm? (I'm assuming the latter as it would be portable and doesn't require the devs to have any sort of encryption capability on their OS of choice.)
The third is to create a copy of the database for each developer on our development server that they are then responsible to keep the schema in sync with the canonical db by means of migration scripts or what have you. This solution seems to be the simplest but doesn't really scale as more developers are added.
How do you deal with this problem?
Upvotes: 1
Views: 290
Reputation: 2757
You could set up a fixtures (seed data) system. You provide the data once and it gets put into the db as many times as you need. That could be held in source control so that the fixtures are used/updated by all users.
I think that auto-generators are usually a bad idea. It is hard for them to generate information that could be real. Fixtures would allow you to make this information and know that it is what you are looking for. You could also push the bounds of your validators by using fixtures.
It may take a bit of time to set up the first time around, but I think you will get a much higher quality of data that is put in for testing.
Regards,
Justin
Upvotes: 0
Reputation: 10140
As tvanfosson mentioned, use fake data instead of live. Doing so will not only keep the live data safe but also allow you to test different scenarios, such as international names and such.
As for how to distribute your DB, your schema and creation scripts really should be in source control, so each developer can create a local copy of the database as they see fit.
Upvotes: 0
Reputation: 532645
Use fake data -- invest in a data generator if you must, but please don't use real data in a development environment, especially if it's possible that access to it may be compromised. I'm more familiar with tools for MS SQL, but googling for "MySQL data generator" brought up EMS SqlManager and Datanamic.
Upvotes: 2