BeetleJuice
BeetleJuice

Reputation: 40896

mysqldump: how to fetch dependent rows

I'd like a snapshot of a live MySQL DB to work with on my development machine. The problem is that the DB is too large, so my thought was to execute:

mysqldump [connection-info-here] --no-autocommit --where="1 limit 1000" mydb > /dump.sql  

I think this will give me the first thousand rows of every table in database mydb. I anticipate that the resulting dataset will break a lot of foreign key constraints since some records will be missing. As a result the application I mean to run on the dev machine will fail.

Is there a way to mysqldump a sample of the database while ensuring that all records dumped abide by key constraints? (for instance if a foreign key is dumped, the matching record in the foreign table will also be dumped).

If that isn't possible, how do you guys deal with this problem?

Upvotes: 4

Views: 1221

Answers (1)

Bill Karwin
Bill Karwin

Reputation: 562328

No, there's no option for mysqldump to dump only rows that match in foreign key relationships. You already know about the --where option, and that won't do it.

I've had the same task as you, to dump a subset of data but only data that is related. For example, for creating a test instance.

I've been using MySQL for many years, I've worked as a MySQL consultant and trainer, and I try to keep up with current tools. I have never heard of any MySQL tool that does this operation.

The only solution I can suggest is to write your own script to dump table by table using SELECT...INTO OUTFILE.

It's sometimes easier to write a custom script just for your specific schema, than for someone to write a general-purpose tool that works for everyone's schema.

How I have dealt with this problem in the past is I don't copy data from the live database. I find some other way to create a subset of fake data for testing. It's probably better to create synthetic data anyway, because then you don't risk accidentally using live data in your dev/test environment, in case some of it is private data.

Upvotes: 2

Related Questions