Alex
Alex

Reputation: 34978

On the fly anonymisation of a MySQL dump

I am using mysqldump to create DB dumps of the live application to be used by developers.

This data contains customer data. I want to anonymize this data, i.e. remove customer names / credit card data.

An option would be:

But this has to much overhead. A better solution would be, to do the anonymization during dump creation.

I guess I would end up parsing all the mysqlsqldump output? Are there any smarter solutions?

Upvotes: 7

Views: 6901

Answers (7)

Pierre POMES
Pierre POMES

Reputation: 41

You can try Myanon: https://myanon.io

Anonymization is done on the fly during dump:

mysqldump | myanon -f db.conf | gzip > anon.sql.gz

Upvotes: 4

matiangul
matiangul

Reputation: 369

I had to develop something similar few days ago. I couldn't do INTO OUTFILE because the db is AWS RDS. I end up with that approach:

Dump data in tabular text form from some table:

mysql -B -e 'SELECT `address`.`id`, "address1" , "address2", "address3", "town", "00000000000" as `contact_number`, "[email protected]" as `email` FROM `address`' some_db > addresses.txt

And then to import it:

mysql --local-infile=1 -e "LOAD DATA LOCAL INFILE 'addresses.txt' INTO TABLE \`address\` FIELDS TERMINATED BY '\t' ENCLOSED BY '\"' IGNORE 1 LINES" some_db

only mysql command is required to do this.

As the export is pretty quick (couple of seconds for ~30.000 rows), the import process is a bit slower, but still fine. I had to join few tables on the way and there was some foreign keys so it will surely be faster if you don't need that. Also if you disable foreign key checks while importing it will also speed up things.

Upvotes: 3

Alex
Alex

Reputation: 34978

phpMyAdmin provides an export option to the SQL format based on SQL queries. It might be an option to extract this code from PHPmyadmin (which is probably well tested) and use it in this application.

Refer to the phpMyAdmin export plugin - exportData method for the code.

Upvotes: 0

PiTheNumber
PiTheNumber

Reputation: 23542

I found to similar questions but it looks like there is no easy solution for what you want. You will have to write a custom export yourself.

Upvotes: 0

Yann Milin
Yann Milin

Reputation: 1355

You can also use the SELECT ... INTO OUTFILE syntax from a SELECT query to make a dump with a column filter.

Upvotes: 1

Colin M
Colin M

Reputation: 13348

Why are you selecting from your tables if you want to randomize the data?

Do a mysqldump of the tables that are safe to dump (configuration tables, etc) with data, and a mysqldump of your sensitive tables with structure only.

Then, in your application, you can construct the INSERT statements for the sensitive tables based on your randomly created data.

Upvotes: 3

Adder
Adder

Reputation: 5868

You could do a select of each table (and not a select *) and specify the columns you want to have and omit or blank those you don't want to have, and then use the export option of phpmyadmin for each query.

Upvotes: 1

Related Questions