dvir
dvir

Reputation: 5115

Best approach for having unique row IDs in the whole database rather than just in one table?

I'm designing a database for a project of mine, and in the project I have many different kinds of objects. Every object might have comments on it - which it pulls from the same comments table.

I noticed I might run into problems when two different kind of objects have the same id, and when pulling from the comments table they will pull each other comments. I could just solve it by adding an object_type column, but it will be harder to maintain when querying, etc.

What is the best approach to have unique row IDs across my whole database?

I noticed Facebook number their objects with a really, really large number IDs, and probably determine the type of it by id mod trillion or some other really big number. Though that might work, are there any more options to achieve the same thing, or relying on big enough number ranges should be fine?

Thanks!

Upvotes: 0

Views: 658

Answers (4)

Peter Henniger
Peter Henniger

Reputation: 1

I am using tables as object classes, rows as objects and columns as object parameters. Everything starts with the class techname, in which every object has its unique identifier, which is unique in the database. The object classes are registered as objects in the table object classes, and the parameters for each object class are linked to it.

Upvotes: 0

candu
candu

Reputation: 2883

If you only have one database instance, you can create a new table to allocate IDs:

CREATE TABLE id_gen (
  id BIGINT PRIMARY KEY AUTO_INCREMENT NOT NULL
);

Now you can easily generate new unique IDs and use them to store your rows:

INSERT INTO id_gen () VALUES ();
INSERT INTO foo (id, x) VALUES (LAST_INSERT_ID(), 42);

Of course, the moment you have to shard this, you're in a bit of trouble. You could set aside a single database instance that manages this table, but then you have a single point of failure for all writes and a significant I/O bottleneck (that only grows worse if you ever have to deal with geographically disparate datacenters).

Instagram has a wonderful blog post on their ID generation scheme, which leverages PostgreSQL's awesomeness and some knowledge about their particular application to generate unique IDs across shards.

Another approach is to use UUIDs, which are extremely unlikely to exhibit collisions. You get global uniqueness for "free", with some tradeoffs:

  • slightly larger size: a BIGINT is 8 bytes, while a UUID is 16 bytes;
  • indexing pains: INSERT is slower for unsorted keys. (UUIDs are actually preferable to hashes, as they contain a timestamp-ordered segment.)

Yet another approach (which was mentioned previously) is to use a scalable ID generation service such as Snowflake. (Of course, this involves installing, integrating, and maintaining said service; the feasibility of doing that is highly project-specific.)

Upvotes: 0

Gerard Yin
Gerard Yin

Reputation: 1313

Why not tweaking your concept of object_type by integrating it in the id column? For example, an ID would be a concatenation of the object type, a separator and a unique ID within the column.

This approach might scale better, as a unique ID generator for the whole database might lead to a performance bottleneck.

Upvotes: 0

Chris Henry
Chris Henry

Reputation: 12010

You could use something like what Twitter uses for their unique IDs.

http://engineering.twitter.com/2010/06/announcing-snowflake.html

For every object you create, you will have to make some sort of API call to this service, though.

Upvotes: 3

Related Questions