Reputation: 8071
We have a mysql table called posts_content.
The structure is as follows :
CREATE TABLE IF NOT EXISTS `posts_content` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`post_id` int(11) NOT NULL,
`forum_id` int(11) NOT NULL,
`content` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=79850 ;
The problem is that the table is getting pretty huge. Many giga-bytes of data ( we have a crawling engine ).
We keep inserting data into the table on a daily bases but seldom do we retrieve the data. Now as the table is getting pretty huge its getting difficult to handle the table.
We discussed two possibilities
I hope I have clearly explained the problem. WHat I need to know is which of the above two would be a better solution in the long run. What are the adv. dis adv. of both the cases.
Thanking you
Upvotes: 6
Views: 4344
Reputation: 311
Although this is an old post, caveat with regards to partitioning if your engine is still MyISAM
. MySQL 8.0 no longer supports partitioning other than Innodb or NDB storage engines only. In that case, you have to convert your MyISAM
table to InnoDB or NDB but you need to remove partitioning first before converting it, else it cannot be used afterwards.
Upvotes: 1
Reputation: 142208
A quick solution for 3x space shrinkage (and probably a speedup) is to compress the content
and put it into a MEDIUMBLOB
. Do the compression in the client, not the server; this saves on bandwidth and allows you to distribute the computation among the many client servers you have (or will have).
"Sharding" is separating the data across multiple servers. See MariaDB and Spider. This allows for size growth and possibly performance scaling. If you end up sharding, the forum_id may be the best. But that assumes no forum is too big to fit on one server.
"Partitioning" splits up the data, but only within a single server; it does not appear that there is any advantage for your use case. Partitioning by forum_id will not provide any performance.
Remove the FOREIGN KEYs
; debug your application instead.
Upvotes: 0
Reputation: 2863
here you have a good answer for your question: https://dba.stackexchange.com/a/24705/15243
Basically, let your system grow and while you get familiarized with partitioning, and when your system really need to be "cropped in pieces", do it with partitioning.
Upvotes: 0
Reputation: 26699
The difference is that in the first case you leave MySQL to do the sharding, and in the second case you are doing it on your own. MySQL won't scan any shards that do not contain the data, however if you have a query WHERE forum_id IN(...)
it may need to scan several shards. As far as I remember, in that case the operation is syncronous, e.g. MySQL queries one partition at a time, and you may want to implement it asyncronously. Generally, if you do the partitioning on your own, you are more flexible, but for simple partitioning, based on the forum_id, if you query only 1 forum_id at a time, MySQL partitioning is OK.
My advice is to read the MySQL documentation on partitioning, especially the restrictions and limitations section, and then decide.
Upvotes: 4