Reputation: 2738
I'm currently designing the structure of my database. I'm planning to have a huge table (let's call it the Clicks table) with hundreds of millions of rows. Many of its columns will be foreign keys referencing other tables, to reduce the size of this huge table and to reduce query time.
In those other 'reference tables' I'm planning to store most of the data about clicks. So when I query the Clicks table, I'll simply JOIN a few of these reference tables to get what I want to know about the clicks.
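For example, a rough sketch of what I have in mind (MySQL; names like referrers are placeholders, and a real click row would reference several such tables):

CREATE TABLE referrers (
    id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    string VARCHAR(50) NOT NULL
);

CREATE TABLE clicks (
    id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    referrer_id INT UNSIGNED NOT NULL,
    clicked_at DATETIME NOT NULL,
    FOREIGN KEY (referrer_id) REFERENCES referrers (id)
);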
1st question: is this good practice in terms of speed, given that I'm going to run lots of SELECTs on this huge Clicks table later on?
These smaller reference tables will have a few thousand rows each, mostly with a single string column. The strings will be somewhere between 5 and 50 characters long.
What I'm planning to do is, when a click comes in, check these small tables to see whether the same value already exists, and insert it if it doesn't.
This would require a SELECT.
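Roughly this flow per click, reusing the sketch above ($string and $id stand for values supplied by my application):

SELECT id FROM referrers WHERE string = $string;
-- if no row came back, insert the new value and take its id:
INSERT INTO referrers (string) VALUES ($string);
-- then record the click against that id:
INSERT INTO clicks (referrer_id, clicked_at) VALUES ($id, NOW());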
2nd question: Is it better to index the string itself and search on it, or should I add another column holding the MD5 of the string and look that up (with an index) instead? In other words, does the string's length affect how long a lookup takes in a simple SELECT?
I'm planning to do SELECTs like this:
SELECT id FROM table1 WHERE string = $string
Is there any better way to achieve any of the above?
Upvotes: 0
Views: 115
Reputation: 211540
If you're hashing these then the hash itself will probably be longer than the strings you're hashing (an MD5 digest is 32 hex characters, versus your 5-50 character strings), making it counter-productive. You'll want to hash things that are consistently larger, often by an order of magnitude or more. A 7KB JSON string, for example, is a good candidate: computing the hash and looking that up in the index would be faster than comparing full strings in an index.
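To illustrate where hashing does pay off, a sketch (MySQL, with a hypothetical pages table) that indexes an MD5 of a large document instead of the document itself:

CREATE TABLE pages (
    id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    payload JSON NOT NULL,           -- e.g. a ~7KB JSON document
    payload_md5 CHAR(32) NOT NULL,   -- kept in sync by the application on insert
    KEY idx_payload_md5 (payload_md5)
);

-- compare 32 fixed-width characters instead of the whole document
-- (a careful version would also compare payload to rule out hash collisions)
SELECT id FROM pages WHERE payload_md5 = MD5($json);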
What you need to do is prototype this, fill it up with a representative amount of data, and see how it performs. Your database will need to be tuned to handle your workload, and your schema will need to be exercised to the breaking point so you know how much data you can handle before your approach melts down.
Maybe that breaking point is 100 million records. Maybe it's 50 billion. Nobody knows how it will perform on your hardware, and only you can find out by testing.
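As one way to do that in MySQL 8, you could fill a schema shaped like the one in the question with throwaway data (row counts are arbitrary; this assumes referrer ids come out as 1-1000):

SET SESSION cte_max_recursion_depth = 1000000;

-- a thousand distinct reference strings
INSERT INTO referrers (string)
WITH RECURSIVE seq (n) AS (
    SELECT 1 UNION ALL SELECT n + 1 FROM seq WHERE n < 1000
)
SELECT CONCAT('ref-', n) FROM seq;

-- a million clicks pointing at random reference rows
INSERT INTO clicks (referrer_id, clicked_at)
WITH RECURSIVE seq (n) AS (
    SELECT 1 UNION ALL SELECT n + 1 FROM seq WHERE n < 1000000
)
SELECT 1 + FLOOR(RAND() * 1000),
       NOW() - INTERVAL FLOOR(RAND() * 2592000) SECOND
FROM seq;

Then run your real queries against it, EXPLAIN them, and keep scaling the row counts up until something hurts.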
Upvotes: 1
Reputation: 1269443
Your design sounds fine. You want a secondary index on the string in each of the reference tables.
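Something like this, assuming a reference table named referrers; making it UNIQUE also enforces the deduplication you're currently doing by hand:

CREATE UNIQUE INDEX idx_referrers_string ON referrers (string);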
It's unclear from your description whether you are doing this one "click" at a time or in batches.
I would recommend a batch approach for this operation, unless you have a compelling need for real-time data. If you do need real-time data, I would tend to advocate a "streaming-ish" approach, where new data is added by inserting into existing tables and never by updating them.
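A sketch of the batch variant (clicks_staging is a made-up staging table): append raw clicks cheaply during the day, then fold them in periodically:

-- during the day: append-only writes, no lookups on the hot path
INSERT INTO clicks_staging (raw_string, clicked_at) VALUES ($string, NOW());

-- at batch time: add any strings not seen before...
INSERT INTO referrers (string)
SELECT DISTINCT s.raw_string
FROM clicks_staging s
LEFT JOIN referrers r ON r.string = s.raw_string
WHERE r.id IS NULL;

-- ...then resolve ids, move the rows into the main table, and clear staging
INSERT INTO clicks (referrer_id, clicked_at)
SELECT r.id, s.clicked_at
FROM clicks_staging s
JOIN referrers r ON r.string = s.raw_string;

TRUNCATE TABLE clicks_staging;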
If you are updating millions of rows each day, individually, then the locking operations during peak processing could get costly. If the table is being used for analytics or reporting, the query loads from that processing could also interfere with the updates.
Upvotes: 1