kuzey beytar
kuzey beytar

Reputation: 3227

Store huge data in MySQL?

I am trying to create a DB table in MySQL to store my product statistics data. Almost every days' stats should be in the database. Problem is the speed.

Currently I am storing, for each product, these values: TIME, SOLD ITEM, PRODUCT_ID, HIT, OTHER_ID

I thought two different ways to store datas:

or your ideas?

Speed tests that I did not so bad, almost good. But you have better ideas or experiences for this issue?

Upvotes: 2

Views: 716

Answers (3)

Jon Black
Jon Black

Reputation: 16559

approach the problem as a data-warehousing / data-mart solution (star/snowflake schema) with summary (aggregate/materialised view) like tables to reduce complex long running queries to faster simple select statements.

recommend bulk loading data into a staging (temporary) schema, cleansing, validating and mapping it before populating your fact and dimension tables :)

Upvotes: 1

lbz
lbz

Reputation: 9728

I am assuming you are using that database only for statics and its not the same "live" database where transactions are stored in first place.

Speed problem can occur:

  • when you insert the data into the database
  • when you query the database (i.e. from a web application)

Having your database dedicated to statistics, its sane to start designing the reports you want to produce; in this way you can define:

  • the data you have to insert into the database
  • the queries you are going to execute against the database

Sketch out the report in Excel (but you can really use any tool) and filling the report with fake data is a good start to have an idea about what you want implemented.

When you are happy with the fake result you can identify the data you need to squeeze into the database, the query you have to implement and the interaction with the report you want to give to your users, if there is any.

How to fill your database with data

  1. In first place you probably have a large and detailed amout of data, like a row describing a purchase. Start to find the dimensions that are actually useful in your report; a dimension is a measure that you care about, like what you have sold, when, who originally sold it.
  2. For every dimension find the smallest level of detail you want to use in your report: do you care about the hour of the purchase, or just the year? do you care about the category of the product sold or just its SKU?

This will tell you the data you have to transfer from your original database to the stats one.

How to keep your data up to date

This strongly depends on how frequently you want your stats to be updated. You can setuo a trigger that updates your stats db in real time or having a script running periodically to upgrade your stats db.

Notes

  1. Whenever the original database is changed in its schema, or more subtly in the way it stores data, you must take in account how those changed impact on your update procedure (triggers or external script)
  2. If your stats have some interaction (eg. from a web application), I'd suggest to use Data Cubes to define your stats db.
  3. Keep in mind you can't sort, select, or group serialized data easly.

Upvotes: 1

Bob Palmer
Bob Palmer

Reputation: 4762

Really depends on your reporting needs - i.e. if you are only reporting by product/day, then rolling the transactional stats into a summary table as part of a batch process makes sense.

In any case, I'd recommend separating out your transactional data and your reporting data into a separate database, that way you can optimize your transactional data for writes, and optimize your reporting database for reads (and do large reports without crushing your transactional processing capabilities).

Upvotes: 6

Related Questions