Or Weinberger

Reputation: 7482

Most efficient way to enforce uniqueness for varchar in large table

I have a MySQL DB that should contain around 50-100 million records.

My table fields are: TABLE_ID (INT), TABLE_STRING (VARCHAR(255)), TABLE_BOOL (INT)

TABLE_ID is my primary key, but I need to make sure that TABLE_STRING is kept unique for all rows in the table.

Should I add a plain index on TABLE_STRING and have my script run a SELECT to check whether the string already exists, or should I set TABLE_STRING as UNIQUE in my table definition?

Is there any other way of achieving this? If not, which of the above is better?
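For reference, the schema described in the question, with the uniqueness requirement expressed as a constraint, could be sketched like this (the table name `my_table` and the index name are assumptions, not from the question):

```sql
-- Sketch of the table described above; column names taken from the question.
CREATE TABLE my_table (
    TABLE_ID     INT          NOT NULL AUTO_INCREMENT,
    TABLE_STRING VARCHAR(255) NOT NULL,
    TABLE_BOOL   INT          NOT NULL DEFAULT 0,
    PRIMARY KEY (TABLE_ID),
    UNIQUE KEY uq_table_string (TABLE_STRING)
) ENGINE=InnoDB;
```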

Thanks,

Upvotes: 0

Views: 126

Answers (2)

Kermit

Reputation: 34063

As already mentioned, you should create a UNIQUE index on TABLE_STRING. You could also look into partitioning, since local indexes in each partition can improve index performance.
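On an existing table, the unique index can be added with a single statement (table and index names assumed). One caveat worth checking in the MySQL manual: every UNIQUE key on a partitioned table must include all columns used in the partitioning expression, so partitioning by TABLE_ID alone would not be compatible with a UNIQUE key on TABLE_STRING by itself.

```sql
-- Add a UNIQUE index to an existing table (table/index names are assumptions):
ALTER TABLE my_table ADD UNIQUE INDEX uq_table_string (TABLE_STRING);
```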

Upvotes: 0

Nikolaos Dimopoulos

Reputation: 11485

If you do not have an issue with space, then you could easily add a UNIQUE index for the TABLE_STRING field.

This unique index will cause any insert or update that would introduce a duplicate value in that field to fail with an error, alerting your application to the conflict.
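With the unique index in place, a duplicate insert fails with error 1062 (ER_DUP_ENTRY). MySQL also lets you handle the conflict in the statement itself (table name and values below are illustrative assumptions):

```sql
-- A second insert of the same string fails with ER_DUP_ENTRY (error 1062):
INSERT INTO my_table (TABLE_STRING, TABLE_BOOL) VALUES ('hello', 0);
INSERT INTO my_table (TABLE_STRING, TABLE_BOOL) VALUES ('hello', 1);  -- fails

-- Silently skip rows that would violate the unique index:
INSERT IGNORE INTO my_table (TABLE_STRING, TABLE_BOOL) VALUES ('hello', 1);

-- Or update the existing row instead of failing:
INSERT INTO my_table (TABLE_STRING, TABLE_BOOL) VALUES ('hello', 1)
    ON DUPLICATE KEY UPDATE TABLE_BOOL = VALUES(TABLE_BOOL);
```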

Since you have a large volume of rows and the field is a VARCHAR(255), expect the index file to be quite large.

If, however, you do have concerns about space, you might want to drop TABLE_ID and use TABLE_STRING as your primary key (which is unique by definition).
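That space-saving alternative could be sketched as follows (table name assumed). Note that with InnoDB the primary key is the clustered index, so a long VARCHAR key also enlarges any secondary indexes you add later:

```sql
-- Alternative: make the string itself the primary key and drop TABLE_ID.
CREATE TABLE my_table (
    TABLE_STRING VARCHAR(255) NOT NULL,
    TABLE_BOOL   INT          NOT NULL DEFAULT 0,
    PRIMARY KEY (TABLE_STRING)
) ENGINE=InnoDB;
```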

I would personally just add the unique index on the TABLE_STRING field.

Please note that adding the index may fail if the table already contains rows with duplicate values in that field. Good practice is to first check whether any duplicates exist.
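A quick way to run that pre-check (table name assumed) is a GROUP BY over the field, keeping only values that occur more than once:

```sql
-- List any values that would prevent the unique index from being created:
SELECT TABLE_STRING, COUNT(*) AS cnt
FROM my_table
GROUP BY TABLE_STRING
HAVING COUNT(*) > 1;
```

An empty result set means the ALTER TABLE adding the unique index should succeed.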

HTH

Upvotes: 1
