Reputation: 327

What is the performance hit of using a string type vs a uuid type for a UUID primary key?

Is there much of a speed difference for index lookups by using string for the primary key versus the actual uuid type, specifically if the string has a prefix like user-94a942de-05d3-481c-9e0c-da319eb69206 (making the lookup have to traverse 5-6 characters before getting to something unique)?

Upvotes: 17

Answers (3)

Schwern

Reputation: 164829

This is a micro-optimization and is unlikely to cause a real performance problem until you get to enormous scales. Use the key that best fits your design. That said, here's the details...

UUID is a built in PostgreSQL type. It's basically a 128 bit integer. It should perform as an index just as well as any other large integer. As of Postgres 13 it offers gen_random_uuid() to make UUIDv4 (basically just a random number). For more functions you can load the uuid-ossp extension, or you can generate a UUID in the client. Generating the UUID on the client distributes the extra work (not much extra work) away from the server.

MySQL does not have a built in UUID type. Instead there's UUID() which can generate a UUIDv1 as a string of hex numbers. Because it's a string, UUID keys may have a performance and storage hit. It may also interfere with replication. You can use UUID_TO_BIN() to turn that into a varbinary(16); effectively a 128 bit integer.

The string UUID will be longer; hex characters only encode 4 bits of data per byte so a hex string UUID needs 256 bits to store 128 bits of information. This means more storage and memory per column which can impact performance.

Normally this would mean comparisons are twice as long, since the key being compared is twice as long. However, UUIDs are normally unique in the first few bytes, so the whole UUID does not need to be compared to know they're different. Long story short: comparing string vs binary UUIDs shouldn't cause a noticeable performance difference in a real application... though the fact that MySQL UUIDs are UTF8 encoded might add cost.

Using UUIDs on PostgreSQL is fine, it's a built-in type.

MySQL's implementation of UUID keys means you have to do a lot of conversion between strings and varbinary, which welcomes bugs, and by using UUIDv1 which has some mild security implications.

Upvotes: 28

saolof

Reputation: 1641

Use the built-in UUID type that maps to a 128-bit int. Not just for performance, but to prevent strings like "password1" from showing up in that column.

Upvotes: 1

Rick James

Reputation: 142298

The real problem with UUIDs comes when the table (or at least the index) is too big to be cached in RAM. When this happens, the 'next' uuid needs to be stored into (or fetch from) some random block that is unlikely to be cached. This leads to more and more I/O as the table grows.

AUTO_INCREMENT ids usually don't suffer that I/O growth because INSERTs always go at the 'end' of the table and SELECTs usually cluster near the end. This leads to effective use of the cache, thereby avoiding the death-by-IO.

My UUID blog discusses how to make "Type-1" UUIDs less costly to performance, at least for MySQL.

Upvotes: 7

What is the performance hit of using a string type vs a uuid type for a UUID primary key?

Answers (3)

Related Questions