devoured elysium

Reputation: 105227

Ordered data structure that allows efficient removal of duplicate items

I need a data structure that:

  • keeps items in the order they were inserted, and
  • lets me efficiently remove all copies of a given item.

I can't really decide what the best data structure would be here. My first thought was something like a List (the problem is that removing items is an O(n) operation), but maybe I'm missing something? What about trees/heaps? Hash tables/maps?

I'll have to assume I'll do as much adding as removing with this data structure.

Thanks

Upvotes: 0

Views: 699

Answers (2)

DaveC

Reputation: 2050

I think you may have to write a dedicated data structure (depending on your efficiency requirements).

Something like a doubly linked list with an extra nextEqualItemPtr in it and a HashMap pointing to the first of each item.

Then you can quickly find the first "b" to remove and follow the chain of nextEqualItemPtrs to remove them all (the list is doubly linked, so it is easy to keep it intact). The real overhead is keeping the map up to date. The nextEqualItemPtr of a newly inserted item can simply point to the node previously stored under that key, i.e. the node returned by map.put(key, newNode).
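
A minimal sketch of that idea in Java (names like MultiOrderedBag, removeAllOf and nextEqual are made up for illustration, and it only covers add plus remove-all-copies):

    import java.util.HashMap;
    import java.util.Map;

    // Sketch of the structure described above: a doubly linked list that keeps
    // insertion order, plus a HashMap from each value to its most recently
    // inserted node. Nodes holding equal values are chained through nextEqual,
    // so removing every copy of a value costs O(copies) rather than O(n).
    public class MultiOrderedBag<E> {

        private static final class Node<E> {
            final E value;
            Node<E> prev, next;   // doubly linked list in insertion order
            Node<E> nextEqual;    // previously inserted node with the same value
            Node(E value) { this.value = value; }
        }

        private Node<E> head, tail;
        private final Map<E, Node<E>> lastOf = new HashMap<>();

        // Append at the tail; O(1).
        public void add(E value) {
            Node<E> node = new Node<>(value);
            if (tail == null) { head = node; } else { tail.next = node; node.prev = tail; }
            tail = node;
            // Chain this node to the previous node with the same value (may be null).
            node.nextEqual = lastOf.put(value, node);
        }

        // Remove every copy of value; O(number of copies).
        public void removeAllOf(E value) {
            Node<E> node = lastOf.remove(value);
            while (node != null) {
                unlink(node);
                node = node.nextEqual;
            }
        }

        private void unlink(Node<E> node) {
            if (node.prev != null) { node.prev.next = node.next; } else { head = node.next; }
            if (node.next != null) { node.next.prev = node.prev; } else { tail = node.prev; }
        }
    }

Removing a single copy rather than all of them would also need to patch up the nextEqual chain, which this sketch leaves out.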

I would definitely use something simple first, and only plug this kind of thing in if/when it is too slow.

Upvotes: 2

Jack

Reputation: 133639

The Bag interface of Apache Commons Collections (homepage) should fulfill your requirements. It has many implementations, so there may also be one that keeps track of insertion order (your first point).

And it has:

  • removeAll
  • remove(Object, nCopies)

It is also quite fast compared to a plain LinkedList or ArrayList, but I'm not sure about getting the indices of inserted elements.

It is described as

Bag interface for collections that have a number of copies of each object
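
For reference, a minimal usage sketch (this assumes the newer commons-collections4 package names and the HashBag implementation; the older 3.x releases use org.apache.commons.collections instead, and HashBag itself does not keep insertion order):

    import org.apache.commons.collections4.Bag;
    import org.apache.commons.collections4.bag.HashBag;

    public class BagDemo {
        public static void main(String[] args) {
            Bag<String> bag = new HashBag<>();  // note: HashBag does not keep insertion order
            bag.add("a");
            bag.add("b", 3);                    // add three copies at once

            System.out.println(bag.getCount("b")); // 3
            bag.remove("b", 2);                    // drop two copies
            System.out.println(bag.getCount("b")); // 1

            bag.remove("a");                       // removes every remaining copy of "a"
            System.out.println(bag.contains("a")); // false
        }
    }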

Upvotes: 1
