sengineer23
sengineer23

Reputation: 33

Storing stock prices in MySQL Database

I am creating a website that does computations based on historical daily stock quotes. Right now I have 5 tables corresponding to different letters in the alphabet (organized by symbol).

This is an example of the create statements I am using to build the tables.

CREATE TABLE `EOD_QRSTU` (
  `symbol` varchar(10) NOT NULL,
  `open` decimal(10,5) DEFAULT NULL,
  `close` decimal(10,5) DEFAULT NULL,
  `high` decimal(10,5) DEFAULT NULL,
  `low` decimal(10,5) DEFAULT NULL,
  `volume` int(11) DEFAULT NULL,
  `date` datetime NOT NULL,
  PRIMARY KEY (`date`,`symbol`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

The most common query is selecting the rows for a particular symbol between two dates. This table for example has about 500,000 rows right now and takes up to 8 seconds to return about 400 rows for a particular symbol within a two-year range. Why does it take so long? How do sites like Yahoo Finance load their charts so fast? I have been thinking about storing each stock's data in its own file in the file system instead, can anyone offer some insight into this approach? If it is fast enough this way I don't even need to limit the number of rows returned, I could simply return every line in the file. Would CSV be a suitable file format?

If it matters I am using PythonAnywhere to host the site, for now, and am using Python with Flask-SQLalchemy for the backend. I also plan on migrating to AWS soon, will that solve part of the issue?

Upvotes: 1

Views: 1806

Answers (2)

Rick James
Rick James

Reputation: 142218

I agree with danblack's Answer (PK column order, charset)

DECIMAL(10,5) is not big enough for BRK.A, currently at $412,802. INT is not big enough for the volume of several indexes.

You probably want NOT NULL on all columns.

A different filesystem probably won't help enough to matter.

Use InnoDB. Period. Full stop.

How much RAM do you have? What is the value of innodb_buffer_pool_size? Do you have HDD or ssd? It sounds like your table is much less than 1GB, so I expect things to be fully cached in RAM most of the time.

Upvotes: 1

danblack
danblack

Reputation: 14666

Reorder your primary key and put symbol before date.

Indexes work best when the fixed part of the index, the symbol is before the range searched part, date.

Examine this by looking at explain {query} and seeing how many bytes of the index is used and the query plan.

Also don't make symbol UTF8 if you only are using an ascii character set.

Upvotes: 2

Related Questions