Jesus is Lord

Reputation: 15409

Class/object to generate unique IDs

I'm using C# but even if you don't know it, it should be pretty easy to follow along with this question.

Here's my problem: I have some objects that I'd like to keep in a hashset-like data structure so that I can look them up by an int ID. These objects have mutable properties, so hashing the objects themselves is not an option (I would need something constant about them to hash, yes?).

What I've done is develop the following interface:

public interface IUniqueIDCollection
{
    // Can return any int that is not currently in use
    // (never requested, or requested and since released).
    int RequestUniqueID();

    // Undoes the request: makes uniqueID available to be handed out again.
    void ReleaseUniqueID(int uniqueID);
}

My initial thought is to store an internal counter in the IUniqueIDCollection that increments as IDs are requested. However, once IDs are released, I would have to keep track of either the ranges or the individual IDs that have been released. I think the latter would be better. But if I used a counter (or any cyclic function) to generate the IDs, then once the counter wraps around I would have to step past runs of IDs that were requested but not yet released.

The usage pattern is this: let's say a maximum of 5,000 IDs will be in use at once. HOWEVER, IDs will very often be requested and then released. Releasing will tend to happen in ranges, i.e. maybe 100 will be requested all at once, and then all 100 will be released within a short time interval.

I know I could use a GUID or something instead of an int, but I'd like to save the space, bandwidth, and processing time that larger IDs would cost.

So my question is: what should the request and release methods look like for the interface I gave above, in terms of pseudocode, given this usage pattern?

Upvotes: 1

Views: 129

Answers (2)

Erik P.

Reputation: 1627

Probably a worse idea than Tom Panning's in almost all cases, but you could use a BitArray to keep track of which IDs are in use. The memory usage is one bit per ID up to the highest ID you have ever handed out; the worst case would be 512MB for mapping out all 32-bit ints. Releasing is easy: just set the corresponding bit to 0. Acquiring (or requesting) an ID requires searching for a 0 bit and, if you don't find one, extending the BitArray.

If you still have the option of extending your BitArray (i.e. you're not at 512MB yet), you would probably not want to search all of the BitArray before deciding to extend - doing that all the time would be slow. You certainly wouldn't always want to start at the same index: it might be a good idea to keep track of the last 0 that you found and start searching from there.

The one advantage that I can see is memory usage once all, or almost all, of the objects are released. At that point Tom Panning's solution requires at least 32 times as much memory as this one, since his queue stores a 32-bit int for every released ID where this uses a single bit. However, I'd expect that in typical usage his solution uses less.
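A minimal sketch of the BitArray idea, under some assumptions: the class name `BitArrayIdPool`, the initial size of 64 bits, and the doubling growth policy are all hypothetical choices, not something the question prescribes. For simplicity it scans the whole array (starting from the last free bit it found) before extending, which is exactly the cost the previous paragraph warns about; a real version might extend earlier.

```csharp
using System;
using System.Collections;

// Hypothetical name; tracks live IDs with one bit each.
public sealed class BitArrayIdPool
{
    // Bit i is true while ID i is in use. Initial size is an arbitrary assumption.
    private readonly BitArray _inUse = new BitArray(64);

    // Index of the last free bit found; the next search starts here.
    private int _cursor;

    public int RequestUniqueID()
    {
        int n = _inUse.Length;

        // Circular scan for a 0 bit, starting at the cursor.
        for (int i = 0; i < n; i++)
        {
            int id = (_cursor + i) % n;
            if (!_inUse[id])
            {
                _inUse[id] = true;
                _cursor = id;
                return id;
            }
        }

        // Every bit is set: double the array. Setting Length to a larger
        // value adds new bits initialized to false.
        _inUse.Length = checked(n * 2);
        _inUse[n] = true;
        _cursor = n;
        return n;
    }

    public void ReleaseUniqueID(int uniqueID)
    {
        // No check that the ID was actually live; callers are trusted here.
        _inUse[uniqueID] = false;
    }
}
```

With these assumptions, the first 64 requests return 0 through 63 in order; releasing 10 and requesting again returns 10, and the request after that extends the array and returns 64.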

Upvotes: 1

Tom Panning

Reputation: 4772

If you're sure that released IDs are safe to reuse immediately (i.e., there won't be stale references to old IDs hanging around that would be confused if a new object were assigned a recently released ID), you can hand out the released IDs first. When an ID is released, you put it at the end of a queue. When a new ID is requested, you take the first one from the queue. If the queue is empty, you increment the internal counter and give out the new number.

Advantages of this implementation:

  • All operations are O(1) (amortized, if the queue is array-backed and occasionally has to grow). You're never iterating over a collection or range. You only ever insert at the end of the queue, remove from the front of the queue, or increment your counter.
  • The memory footprint should be fairly low because you're trying to use up the queue as quickly as possible.
  • The implementation is straightforward.

Disadvantages:

  • You'll be reusing IDs quickly, so you give up the protection a large index range could offer: a new object may get the same ID as a recently released one.
  • You won't be able to even guess at the age of an object by looking at its ID.
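The queue-plus-counter scheme described above could be sketched as follows. The class name `QueueIdPool` is hypothetical, and the sketch assumes callers only release IDs they currently hold (a double release is not detected and would lead to the same ID being handed out twice).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical name; reuses released IDs before minting new ones.
public sealed class QueueIdPool
{
    // Released IDs waiting to be handed out again, oldest first.
    private readonly Queue<int> _released = new Queue<int>();

    // The next ID that has never been issued.
    private int _next;

    public int RequestUniqueID()
    {
        // Prefer a recycled ID so the queue stays short.
        if (_released.Count > 0)
        {
            return _released.Dequeue();
        }

        // Fail loudly instead of wrapping around into IDs that may be live.
        if (_next == int.MaxValue)
        {
            throw new InvalidOperationException("ID space exhausted.");
        }
        return _next++;
    }

    public void ReleaseUniqueID(int uniqueID)
    {
        // Assumption: uniqueID is currently live and is released only once.
        _released.Enqueue(uniqueID);
    }
}
```

For example, two requests on a fresh pool return 0 and 1; after releasing 0, the next request returns 0 again, and the one after that returns 2. With at most 5,000 IDs live at once, the counter never needs to go past 4,999 as long as releases follow the stated assumption.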

Upvotes: 5
