Reputation: 142
I know that when taking a database dump I can ignore some tables, but I want to dump all tables with only some percentage of the data, say 20% or 40%, because the whole dump is too big. Here is my normal dump command:
mysqldump -h dburl -u user -ppassword --databases dbname > dbname.sql
I am not looking for an OS-specific solution; I am using Ubuntu Linux.
Upvotes: 10
Views: 1094
Reputation: 2953
It sounds like you want to avoid writing a script. One quick solution is to use the --where option of mysqldump.
mysqldump --opt --where="1 limit 1000" myschema
This limits the dump to 1000 rows per table; adjust the number to your size restrictions.
You can follow this up with an offset dump to get the next 1000 rows; a small adjustment is needed so the tables are not recreated.
mysqldump --opt --where="1 limit 1000 offset 1000" --no-create-info myschema
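Putting the two together, a minimal sketch (assuming a schema named myschema and 1000-row chunks; file names are just examples) that appends successive chunks into a single file:
# first chunk, including the CREATE TABLE statements
mysqldump --opt --where="1 limit 1000" myschema > myschema_sample.sql
# next chunk, data only, appended to the same file
mysqldump --opt --where="1 limit 1000 offset 1000" --no-create-info myschema >> myschema_sample.sql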
You can mix this up further; say you want only 40% of all data, taken from randomly selected rows:
mysqldump --opt --where="1 having rand() < 0.40" myschema
Upvotes: 2
Reputation: 142298
The 80-20 rule says that the smallest 80% of the tables will probably consume only 20% of the space. So have one mysqldump for them.
Then have a separate mysqldump for each remaining table that is still smaller than about 20% of the space.
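A sketch of that split, assuming the large tables in dbname are hypothetically named big_table1 and big_table2:
# everything except the big tables in one dump
mysqldump dbname --ignore-table=dbname.big_table1 --ignore-table=dbname.big_table2 > small_tables.sql
# each remaining large-but-manageable table on its own
mysqldump dbname big_table1 > big_table1.sql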
Finally, any big tables need the --where option mentioned by Nambu14. Or you could try the kludge of saying --where="true LIMIT 20000,10000" to sneak an OFFSET and LIMIT in. (See one of the comments on https://dev.mysql.com/doc/refman/8.0/en/mysqldump.html .) But do not allow writes to the table while doing that; it could lead to extra/missing records.
Or you could adapt chunking techniques as discussed here. This avoids the extra/missing problem and avoids the LIMIT kludge. With luck, you can hard-code the range values needed, using ranges like --where="my_pk >= 'def' AND my_pk < 'mno'".
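A sketch of PK-range chunking, assuming a hypothetical table big_table1 whose primary key my_pk is a string column and whose range boundaries ('def', 'mno') were picked by inspecting the data beforehand:
# first range, includes the CREATE TABLE
mysqldump dbname big_table1 --where="my_pk < 'def'" > big_table1_chunk1.sql
# later ranges, data only
mysqldump dbname big_table1 --no-create-info --where="my_pk >= 'def' AND my_pk < 'mno'" > big_table1_chunk2.sql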
Don't forget to deal with triggers, stored routines, views, etc.
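One way to capture those objects is a separate structure-only dump (a sketch; --triggers is on by default, the other flags are standard mysqldump options):
# schema objects only: table definitions, views, stored routines, triggers, events - no row data
mysqldump --no-data --routines --triggers --events dbname > dbname_schema.sql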
Upvotes: 4
Reputation: 380
There's a similar question open. With the --where option you can limit the number of records included in the mysqldump (official documentation here), but this option applies to every table in the database.
Another way is to prepare the data with a SQL script first and then dump the result; this works as a pseudo ETL pipeline.
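A rough sketch of that pseudo-ETL approach, assuming a hypothetical staging schema dbname_sample and a table named mytable that you want to sample at roughly 20%:
# build a staging schema holding a ~20% random sample of each table of interest
mysql -u user -p -e "CREATE DATABASE IF NOT EXISTS dbname_sample; CREATE TABLE dbname_sample.mytable LIKE dbname.mytable; INSERT INTO dbname_sample.mytable SELECT * FROM dbname.mytable WHERE RAND() < 0.20;"
# dump the prepared sample instead of the full database
mysqldump -u user -p --databases dbname_sample > dbname_sample.sql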
Upvotes: 2