Reputation: 4006
I am trying to develop an time card application. So for Every month there will be 30 or 31 days. (with the fields AM-IN, AM-OUT, PM-IN, PM-OUT, Etc) which is a VARCHAR(4500) in mysql
What is my idea is to store this one month data (30 days) in to one row in the database. I am storing 30 days data in XML format. so while fetching only one row is selected.
Everything is perfect. It is working great.
The client is expecting 1 million users to use this time card. Now the issue comes when i created stress data. I created stress data for 1 million users for 3 years. Exactly ( 1 million * 12 months *3) number rows created. The application is working fine. But when i see the disk usage this table consumes 50 GB. I am sure this 50 GB consumption is because of VARCHAR(4500). If i break it up in to seperate columns this issue wont be there.
Here is my question. If i break the time card VARCHAR(4500) in to seperate fields i will be storing rows for each day. So the number of rows stored will be ( 1 million * 12 months * 30 days *3)
In the case of real time (10,000 users accessing parallely this time card page) Will tomcat + mysql can handle 10,000 parallel requests ( i mean fetching 30 records per hit) ?
Which DATA MODAL to use
1) Storing 1 month data in a single row
or
2) Storing 1 month data in 30 rows?
Upvotes: 1
Views: 164
Reputation: 28197
IMHO I would go with your 2nd data model. (A row per day of data) Breaking up the data into separate columns makes more sense and will allow you to do better data validation, indexing, efficiency, etc. This looks like a good situation to use partitioning based on a date, where you can roll partitions off the back of your main data table and store them either in lower cost storage or export them to a file as Italy suggested. This should keep your table at a manageable size and allow for better query performance. I recommend reading up on the different storage engine options you have for MySQL as their implementation aspects can change the performance greatly depending on the throughput you need.
Upvotes: 1
Reputation: 31223
In the case of real time (10,000 users accessing parallely this time card page) Will tomcat + mysql can handle 10,000 parallel requests ( i mean fetching 30 records per hit) ?
No the perfomance is depends on the level of caching, if every user access different card each time (totally random) and you have 50G of DB so your would be disk bounded, and no, you would not be able to fetch 10K records in one second from different places in the disk in any case.
On the other hand of 99.9% of users access only recent entries, so 50/12/3 ~= 1.5G frequently access data, so it is stored in memory cache, you may have a chance to be able to fetch 10K requests per user on the computer with lots of memory and CPUs, but I don't think that you can do this in parallel requests, because MySQL has thread per connection.
In any case, you would probably need to prepare partition the DB to several servers so you have an ability to scale up and implement efficient entries caching in memory.
EDIT in any case, if you try to store key/value data only, without additional indexing, I would suggest to pic something simpler then full relational database, take a look on http://memcachedb.org/, or separate archive storage and storage that can be updated -- because storage that is not updated may be stored differently.
Upvotes: 0
Reputation: 53597
Which DATA MODAL to use 1) Storing 1 month data in a single row or 2) Storing 1 month data in 30 rows?
Store the current month in the way that is the fastest.
Archive the months before in the way that makes most sense (Heck, I would even export the data to a file, this is purely read only data).
Upvotes: 0