Vijay Boyapati
Vijay Boyapati

Reputation: 7708

Ensuring uniqueness on a varchar greater than 255 in MYSQL/InnoDB

I have a table which contains URLs, some of which are longer than 255 characters. I want to impose a uniqueness constraint on the URL column, but MySQL isn't letting me create a key on the URL. I'm using an InnoDB/UTF8 table. From what I understand it's using multiple bytes per character with a limit of 766 bytes for the key (in InnoDB).

What is an elegant way of keeping the rows unique based on URL?

Upvotes: 4

Views: 2299

Answers (3)

Andre Dalcher
Andre Dalcher

Reputation: 128

In MySQL 5.7.6 or later, you can do this using a generated column containing a hash of the column that you want to place the uniqueness constraint on.

Generated columns are VIRTUAL by default, meaning that the values are never stored on disk. Virtual columns can only be indexed in versions of MySQL >= 5.7.8 using the InnoDB engine; for MyISAM or for older versions of MySQL you would have to use a STORED column.

mysql> CREATE TABLE url_hash_test (
    ->   url TEXT,
    ->   url_hash VARCHAR(32) AS (md5(url)),
    ->   UNIQUE(url_hash)
    -> );
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO url_hash_test SET url='http://example.com';
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO url_hash_test SET url='http://example.com';
ERROR 1062 (23000): Duplicate entry 'a9b9f04336ce0181a08e774e01113b31' for key 'url_hash'

Upvotes: 2

unutbu
unutbu

Reputation: 880269

For MySQL 5.7 or newer, see Andre Dalcher's answer for a nicer way using generated columns.


You could use a SHA1 hash of the url as the unique key. There is a chance that two urls have the same hash, but the probability of that is so ridiculously small, that for practical purposes this method should work fine.


You could also set up a trigger so the hash column is computed automatically whenever you INSERT:

CREATE TRIGGER mytrigger
BEFORE INSERT
ON foo
FOR EACH ROW SET
    NEW.hash = SHA1(NEW.url)

Upvotes: 5

ypercubeᵀᴹ
ypercubeᵀᴹ

Reputation: 115600

You could split the URLs into 2 or more columns and make the combination (URLpart1, URLpart2, ..., URLpartN) unique. The maximum allowed index length will then be raised from 767 to 3072 bytes. Example:

CREATE TABLE atest
( id INT NOT NULL AUTO_INCREMENT 
, a VARCHAR(255) NOT NULL
, b VARCHAR(255) NOT NULL DEFAULT ''
, c VARCHAR(255) NOT NULL DEFAULT ''
, d VARCHAR(255) NOT NULL DEFAULT ''
, PRIMARY KEY (id)
, UNIQUE INDEX url_idx (a,b,c,d)
) ENGINE = InnoDB ;

Upvotes: 0

Related Questions