Reputation: 196
I have a table that is roughly 290,000 rows long. Before backup, it probably took <200 MB. When I created a backup of this table using mysqldump
, the backup file takes ~800 MB, and when I reload from the backup file using mysql
, I now see that it has ~430,000 rows, way more than the original table (I am checking via HeidiSQL UI). But if I do a query on the total range of the primary key, it is the same as the old table (~290,000). What could have possibly gone wrong?
Here is the CREATE code for the particular table in concern. It is just a list of variables (of DECIMAL type)
CREATE TABLE `ciceroout` (
`runID` INT(11) NOT NULL AUTO_INCREMENT,
`IterationNum` DECIMAL(20,10) NULL DEFAULT NULL,
`IterationCount` DECIMAL(20,10) NULL DEFAULT NULL,
`RunningCounter` DECIMAL(20,10) NULL DEFAULT NULL,
\* more 100 variables like this *\
PRIMARY KEY (`runID`)
)
COLLATE='latin1_swedish_ci'
ENGINE=InnoDB
AUTO_INCREMENT=287705
;
EDIT: Here is the actual dump and restore commands I used. Our database has six tables, and I already dumped one table so here I am dumping the remaining five tables.
dump tables :
mysqldump -u root --single-transaction=true --verbose -p [dbname] --ignore-table=[dbname].images > \path\[backupname].sql
restore tables (after dropping the original database, and starting an empty one):
mysql -u root -p [db name] < \path\[backupname].sql
and here is what I see on HeidiSQL UI
Upvotes: 3
Views: 1218
Reputation: 20550
If you wonder about the big export file: thats normal.
The export is stored in a human readable format (SQL), while the actual data on the tablespace is stored in a much more effficient data structures (B+Tree and others)
Regarding the table statistics HeidiSQL is showing you:
For InnoDB, the "number of rows" statistic is just an approximation.
The result of COUNT(*)
gives you the real amount of rows, which is matching the original, right?
The approximation will change over time and get better as you start working with the data.
The MySQL manual page for SHOW TABLE STATUS states:
The number of rows. Some storage engines, such as MyISAM, store the exact count. For other storage engines, such as InnoDB, this value is an approximation, and may vary from the actual value by as much as 40 to 50%. In such cases, use SELECT COUNT(*) to obtain an accurate count.
Upvotes: 3
Reputation: 142503
Let's say you are dumping an INT
, which is a 4-byte quantity in the database.
Value = 1 -- dump contains ...,1,... -- effectively 2 bytes.
value = -1222333444 -- dump contains ...,-1222333444,... -- 12 bytes
With those examples, you see that an INT
can take between half as much space and 3-times as much space when dumped. (Other datatypes lead to other examples.)
The "280K rows" is exact and won't change until you INSERT
/DELETE
rows. The "430K" is, as already mentioned, an approximation.
The actual disk space may have increased or decreased slightly after dumping and loading. This is due to a large number of factors.
We just have to live with these non-very-important inconsistencies.
SHOW TABLE STATUS
is another way to see the disk space.
I think of "counters" as being whole numbers. Is there some reason for having 10 decimal places on this:
RunningCounter` DECIMAL(20,10)
Changing all of those to INT
would shrink each column from 10 bytes to 4 bytes. This would cut the disk utilization in half.
Upvotes: 2