Reputation: 363
I would like to store stock trading data for 1000 symbols. The data is actually converted from text files so there is no need for inserts and updates; only read-only access will be required.
The data is basically grouped like this: each symbol has many records of the form {timestamp, price, quantity}, where each record represents one trade.
An approximate upper bound for one symbol is 5 records/second over an 8-hour working day, i.e. 5x60x60x8 = 144K records per day. So 1K symbols would generate roughly 144M records per day.
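As a back-of-the-envelope check of those volumes (the ~250 trading days per year is my assumption, not from the question):

```python
# Volume estimate from the question's figures.
records_per_symbol_per_day = 5 * 60 * 60 * 8       # 5 trades/s, 8-hour day
records_per_day = records_per_symbol_per_day * 1000  # 1K symbols

# Assumption: ~250 trading days/year, over the stated 5-year period.
trading_days = 250 * 5
total_records = records_per_day * trading_days

print(records_per_symbol_per_day)  # 144000
print(records_per_day)             # 144000000
print(total_records)               # 180000000000, i.e. ~180 billion rows
```

At ~180 billion rows over 5 years, row width and indexing strategy dominate the design, which is what the answers below focus on.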
Most of the operations on the data would be something like:
Now the question: what would be the best design for a database in this case?
The database may be either MS SQL or MySQL. The total time period - up to 5 years. Thank you!
Upvotes: 6
Views: 2839
Reputation: 1622
The third option is the best one. You need high read performance with almost negligible writes.
Your requirements are best suited to NoSQL databases: a single table with no relationships, so MySQL would be overkill. More info --> NoSQL Databases
Upvotes: 4
Reputation: 371
That's a whole lot of data. Do look at NoSQL.
Using SQL, here are some basic ideas:
Put all price data in one table, using the smallest data types possible: a SymbolId (int) to reference the symbol, the smallest datetime type that fits, and the smallest monetary type that fits.
Do denormalize. Make a second table with min/max/avg per day and SymbolId.
Research horizontal partitioning and use indexes.
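The compact-types and denormalized-summary ideas above can be sketched like this. This is an illustrative SQLite sketch, not MS SQL/MySQL DDL; the cents-based price column and all names are my own choices:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Compact trades table: int symbol id, epoch-seconds timestamp,
# and price in integer cents to keep each row as small as possible.
cur.execute("""
    CREATE TABLE trades (
        symbol_id INTEGER NOT NULL,
        ts        INTEGER NOT NULL,   -- Unix epoch seconds
        price_c   INTEGER NOT NULL,   -- price in cents
        quantity  INTEGER NOT NULL
    )
""")
cur.execute("CREATE INDEX idx_trades_symbol_ts ON trades (symbol_id, ts)")

# Denormalized min/max/avg-per-day table, rebuilt after each bulk load
# (the data is read-only, so it never goes stale).
cur.execute("""
    CREATE TABLE daily_summary (
        symbol_id INTEGER NOT NULL,
        day       TEXT    NOT NULL,   -- YYYY-MM-DD
        min_c     INTEGER,
        max_c     INTEGER,
        avg_c     REAL,
        PRIMARY KEY (symbol_id, day)
    )
""")

# Three sample trades for one symbol, spanning two days.
cur.executemany("INSERT INTO trades VALUES (?, ?, ?, ?)", [
    (1, 1300000000, 2510, 100),
    (1, 1300000300, 2530, 50),
    (1, 1300086400, 2490, 75),   # exactly one day later
])
cur.execute("""
    INSERT INTO daily_summary
    SELECT symbol_id, date(ts, 'unixepoch'),
           MIN(price_c), MAX(price_c), AVG(price_c)
    FROM trades
    GROUP BY symbol_id, date(ts, 'unixepoch')
""")
summary_rows = cur.execute("SELECT COUNT(*) FROM daily_summary").fetchone()[0]
print(summary_rows)  # 2 — one summary row per symbol per day
```

Daily queries then hit the small summary table instead of scanning billions of trade rows.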
Upvotes: 4
Reputation: 6240
Since you'll be running queries from one datetime to another, I wouldn't split the tables up at all. Instead, read up on sharding. Below is the schema I would use:
symbols
  id    varchar(6)   // MSFT, GOOG, etc.
  name  varchar(50)  // Microsoft, Google, etc.
  ...

trades
  id         unsigned bigint (PK)
  symbol_id  varchar(6) (FK -> symbols.id)
  qwhen      datetime
  price      double
  quantity   double
  ...
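The schema above, sketched in SQLite with the kind of datetime-range query the question describes (table and column names follow the answer; the sample data is made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE symbols (
        id   TEXT PRIMARY KEY,        -- varchar(6): MSFT, GOOG, etc.
        name TEXT                     -- varchar(50)
    );
    CREATE TABLE trades (
        id        INTEGER PRIMARY KEY,
        symbol_id TEXT REFERENCES symbols(id),
        qwhen     TEXT,               -- datetime
        price     REAL,
        quantity  REAL
    );
    -- Composite index so a (symbol, datetime range) query is a range scan.
    CREATE INDEX idx_trades_symbol_qwhen ON trades (symbol_id, qwhen);
""")

cur.execute("INSERT INTO symbols VALUES ('MSFT', 'Microsoft')")
cur.executemany(
    "INSERT INTO trades (symbol_id, qwhen, price, quantity) VALUES (?, ?, ?, ?)",
    [
        ('MSFT', '2011-03-01 09:30:00', 25.10, 100),
        ('MSFT', '2011-03-01 10:00:00', 25.30, 50),
        ('MSFT', '2011-03-02 09:30:00', 24.90, 75),
    ],
)

# "From one datetime to another" for a single symbol.
rows = cur.execute("""
    SELECT qwhen, price FROM trades
    WHERE symbol_id = 'MSFT'
      AND qwhen BETWEEN '2011-03-01 00:00:00' AND '2011-03-01 23:59:59'
    ORDER BY qwhen
""").fetchall()
print(len(rows))  # 2 — only the trades on 2011-03-01
```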
Upvotes: 3