Reputation: 1
We have started a new project using MySQL, spring boot, and Angular js. Initially, we did not realize our DB is going to handle large data.
The number of tables will not be large (<130), only 10 to 20 tables will be contained in more data, which is almost inserted/ read/ update.
The estimated amount of data in that 10 table is going to grow at 12,00,000 records in a month, and we should not delete those data be able to do various reports.
There needs to be (read-only) replicated database as a backup/failover, and maybe for offloading reports in peak time.
I don't have first-hand experience with that large databases, so I'm asking the ones that have which DB is the best choice in this situation. as we have completed 100% coding and development but now we realize this. I have doubts may be MYSQL going to handle large data. I know that Oracle is the safe bet, interested if Mysql with a similar setup. But it is bound only in MySQL I am ok with any DB based on you all feedback I can take a call.
Open source DB more preferable but it's not mandatory we can go for paid DB also.
Upvotes: 0
Views: 339
Reputation: 335
I agree with MontyPython, MySql can do it and the design is critical. Fortunately MySql allows you to be flexible over time as needed.
I've had history tables needed used in daily reporting that grew to over a billion records in plain MySql and had no problems.
I've also used MySql Merge tables to divide up tables with big-ish rows (100KB+) to speed things up. Basically keeping the individual merge table file sizes under 30GB each. However that solution increases the open file count (in the system) per client - might be a bigger deal on a clustered system. That one was not.
That said, I like to give Honorable Mention to:
MariaDB - MySql but with contributions from Facebook, Alibaba, Google, and more
.
I've moved most of my MySql community edition projects over to MariaDB and have been very happy. It's an almost transparent upgrade.
They offer an interesting enterprise Big Data Analytics (MariaDB AX) package, but with your current requirements its probably overkill and the standard community edition will fulfill your needs.
For example, here's an informative tutorial on how to set up a scalable Cluster (Galera) and adding MaxScale for High Availability:
https://mariadb.com/resources/blog/getting-started-mariadb-galera-and-mariadb-maxscale-centos
Another interesting option is Vitesse - developed at Youtube, which allows for sharded mysql through a (mostly) driver based solution. It solves the problem of needing to have available access to huge amounts of data and always yield good performance. As such, it goes beyond high availability and focuses on a solution wherein no single query (ie. a report against millions of rows of historical data) can negatively impact the other queries needing to be performed.
Upvotes: 0
Reputation: 2994
MySQL is more than capable of handling such loads. In fact, it is capable of handling much much more load than what you are talking about. You just have to create the right kind of tables. You can do that by choosing
EDIT: You've also got to choose the right kind of data modelling and normalization strategy for your use-case. Most of OLTP applications require some level of normalization. But if you want to do analytics and aggregates on heavy tables, you should either have a Data Warehouse of have highly denormalized tables to avoid joins and/or
have a column-oriented database to support such queries.
MySQL is open-source and has a very strong community support so you will find a lot of literature around any issue that you face. You can also find all the filed bugs (resolved and unresolved) here.
As far as the number of tables are concerned, there's really no cap on that. See here, MySQL permits 4 billion tables if you're using InnoDB as the engine.
A lot of very big companies with scale use MySQL in some capacity. Facebook is one of them.
With the growing popularity of JSON as the de facto data exchange format across the internet, MySQL has also provided native JSON support in 5.7, so now you can store and query JSON from your APIs, if required.
MySQL Replication works! Earlier, MySQL used to support coordinate replication only but now it supports GTID replication which makes it easier to maintain and fix replication issues. There are third-party replicators also available in the market. For instance, Continuent's Tungsten is a replicator written in Java and is a replacement for native replication. It comes with a lot of configuration options which are not available with native MySQL replication.
Upvotes: 1