Rick_C137
Rick_C137

Reputation: 142

Mysql take dump of some portion like 10-20 % of whole database

I know that to take database dump I can ignore some tables. but I want to take the dump of all table with some percentage of data like 20% 40% because the whole dump is too big. here is my normal dump query.

mysqldump -h dburl -u user -p password --databases dbname > dbname.sql

I am not looking for specific OS and using Linux Ubuntu.

Upvotes: 10

Views: 1094

Answers (3)

harvey
harvey

Reputation: 2953

It sounds like you want to avoid making a script, one quick solution is to use the --where option for mysqldump.

mysqldump --opt --where="1 limit 1000" myschema

This will limit dumps to 1000 rows - obviously adjust to your size restrictions.

You can follow this up with an offset dump to get the next 1000 - a small adjustment is needed so the table is not recreated.

mysqldump --opt --where="1 limit 1000 offset 1000" --no-create-info myschema

You can mix this up further, say you want only 40% of all data, from randomly selected rows:

mysqldump --opt --where="1 having rand() < 0.40" myschema

Upvotes: 2

Rick James
Rick James

Reputation: 142298

The 80-20 rule says that the smallest 80% of the tables will probably consume only 20% of the space. So have one mysqldump for them.

Then have more mysqldump(s) for each remaining table smaller than 20% of the space.

Finally, any big tables need the --where option mentioned by Nambu14. Or you could try the kludge of saying --where="true LIMIT 20000,10000" to sneak an OFFSET and LIMIT in. (See one of the comments on https://dev.mysql.com/doc/refman/8.0/en/mysqldump.html ) But do not allow writes to the table while doing that -- it could lead to extra/missing records.

Or you could adapt chunking techniques as discussed here . This avoids the extra/missing problem and avoids the LIMIT kludge. With luck, you can hard code the range values needed for ranges like this --where="my_pk >= 'def' AND my_pk < 'mno'"

Don't forget to deal with Triggers, Stored routine, Views, etc.

Upvotes: 4

Nambu14
Nambu14

Reputation: 380

There's a similar question open. With the --where option you can limit the amount of records included in the mysqldump (official documentation here), but this option applies for every table in the database.

Another way is to give the command a sql script to run and prepare the data in that script, this will work as a pseudo ETL pipeline.

Upvotes: 2

Related Questions