juanba1984
juanba1984

Reputation: 357

Partition tables in MySQL

We have a MySQL table (table_ha) like this one:

Name = table_ha
+----------+------------------+
| hash_loc |     hash_val     |
+----------+------------------+
|  242342  | 9606075000001005 |
+----------+------------------+
|  431231  | 9606075000005208 |
+----------+------------------+
|  342344  | 7645345456745536 |
+----------+------------------+
|  324254  | 7656453453465788 |
+----------+------------------+
|  656456  | 9788674534546766 |
+----------+------------------+
|  674453  | 3458752778456834 |
+----------+------------------+
|   ...    |       ...        |
+----------+------------------+
|  765874  | 8796634586346785 |
+----------+------------------+
|  864534  | 9834667054534588 |
+----------+------------------+

We continuously execute queries like the following one:

SELECT * FROM table_ha (SELECT 1 AS hash_loc UNION ALL SELECT 28700 UNION ALL SELECT 28728 ... UNION ALL SELECT 28680 UNION ALL SELECT 28694) AS T1 ON table_ha.hash_loc = T1.hash_loc'

We must assume that we might have thousands of numbers in the query (enclosed in UNION ALL SELECT X). When the number of rows in table_ha is not high, it works ok. Now, imagine having thousands of millions of rows. Then it becomes very slow.

Do you know if partitioning can work in a case like this? How can that be applied to the present table? Do you now any other alternative to this?

NOTE: hashloc is a BigInt(32) and hash_val is a BigInt(64)

Upvotes: 1

Views: 191

Answers (1)

Bill Karwin
Bill Karwin

Reputation: 562631

I don't think partitioning is needed in this case. I would recommend making sure you have an index on table_ha.hash_loc.

I'm not sure why you use the subquery with UNION instead of just using an IN() predicate:

SELECT * FROM table_ha 
WHERE hash_loc IN (1, 28700, 28728 ... 28680, 28694);

By the way, BIGINT(32) is the same as BIGINT(64). See my answer to Types in MySQL: BigInt(20) vs Int(20)


Re your comment:

Partitioning only helps when you're searching on the column(s) used for the partitioning key. And you can only partition a given table one way. Indexes are more generally useful, because you can create multiple indexes per table.

I work with tables of hundreds of millions of rows, and the indexes help a lot. But indexes have to be designed carefully to match each specific query you want to optimize.

You might like my presentation How to Design Indexes, Really. Also there's a video of me presenting it: https://www.youtube.com/watch?v=ELR7-RdU9XU

Upvotes: 1

Related Questions