Ozzy
Ozzy

Reputation: 10643

MySQL, Two billion rows of data, read only, performance optimisations?

I have a set of integer data. The first being the number 0 and the last being 47055833459. There are two billion of these numbers from the first to the last and they will never change or be added to. The only insert into the mysql table will be loading this data into it. From then on, it will only be read from.

I predict the size of the database table to be roughly 20Gb. I plan on having two columns:

id, data

Id will be a primary key, auto incremented unsigned INT and data will be an unsigned BIGINT

What will be the best way of optimising this data for read only with those two columns? I have looked at the other questions which are similar but they all take into account write speeds and ever increasing tables. The host I am using does not support MySQL partitioning so unfortunately this is not an option at the moment. If it turns out that partitioning is the only way, then I will reconsider a new host.

The table will only ever be accessed by the id column so there does not need to be an index on the data column.

So to summarise, what is the best way of handling a table with 2 billion rows with two columns, without partitioning, optimised for reads, in MySQL?

Upvotes: 3

Views: 2574

Answers (2)

Branko Dimitrijevic
Branko Dimitrijevic

Reputation: 52107

Assuming you are using InnnDB, you should simply:

CREATE TABLE T (
    ID INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    DATA BIGINT UNSIGNED
);

This will effectively create one big B-Tree and nothing else, and retrieving a row by ID can be done in a single index seek1. Take a look at "Understanding InnoDB clustered indexes" for more info.


1 Without table heap access, in fact there is no heap at all.

Upvotes: 3

O. Jones
O. Jones

Reputation: 108676

Define your table like so.

CREATE TABLE `lkup` (
  `id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
  `data` BIGINT UNSIGNED NOT NULL,
  PRIMARY KEY (`id`, `data`)
) 

The compound primary key will consume disk space, but will make lookups very fast; your queries can be satisfied just by reading the index (which is known as a covering index).

And, do OPTIMIZE TABLE lkup when you're finished loading your static data into it. That may take a while, but it will pay off big at runtime.

Upvotes: 0

Related Questions