devoured elysium

Reputation: 105227

Ordered data structure that allows efficient removal of duplicate items

I need a data structure that:

  • keeps items in the order they were inserted, and
  • lets me efficiently remove all copies of a given item.

I can't really decide what the best data structure would be here. My first thought was something like a List (the problem is that removing items is an O(n) operation), but maybe I'm missing something? What about trees/heaps? Hash tables/maps?

I'll have to assume I'll do as much adding as removing with this data structure.

Thanks

Upvotes: 0

Views: 699

Answers (2)

DaveC

Reputation: 2050

I think you may have to write a dedicated data structure (depending on your efficiency requirements).

Something like a doubly linked list with an extra nextEqualItemPtr in it and a HashMap pointing to the first of each item.

Then you can quickly find the first "b" to remove and follow the chain of nextEqualItemPtrs to remove them all (the list is doubly linked, so it is easy to keep it intact). The real overhead is keeping the map up to date. The nextEqualItemPtr of a newly inserted item can simply point to the node previously stored under that key, i.e. the node returned by map.put(key, newNode).
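
A minimal sketch of that idea in Java (names like MultiOrderedBag, removeAllOf and nextEqual are made up for illustration, and it only covers add plus remove-all-copies):

    import java.util.HashMap;
    import java.util.Map;

    // Sketch of the structure described above: a doubly linked list that keeps
    // insertion order, plus a HashMap from each value to its most recently
    // inserted node. Nodes holding equal values are chained through nextEqual,
    // so removing every copy of a value costs O(copies) rather than O(n).
    public class MultiOrderedBag<E> {

        private static final class Node<E> {
            final E value;
            Node<E> prev, next;   // doubly linked list in insertion order
            Node<E> nextEqual;    // previously inserted node with the same value
            Node(E value) { this.value = value; }
        }

        private Node<E> head, tail;
        private final Map<E, Node<E>> lastOf = new HashMap<>();

        // Append at the tail; O(1).
        public void add(E value) {
            Node<E> node = new Node<>(value);
            if (tail == null) { head = node; } else { tail.next = node; node.prev = tail; }
            tail = node;
            // Chain this node to the previous node with the same value (may be null).
            node.nextEqual = lastOf.put(value, node);
        }

        // Remove every copy of value; O(number of copies).
        public void removeAllOf(E value) {
            Node<E> node = lastOf.remove(value);
            while (node != null) {
                unlink(node);
                node = node.nextEqual;
            }
        }

        private void unlink(Node<E> node) {
            if (node.prev != null) { node.prev.next = node.next; } else { head = node.next; }
            if (node.next != null) { node.next.prev = node.prev; } else { tail = node.prev; }
        }
    }

Removing a single copy rather than all of them would also need to patch up the nextEqual chain, which this sketch leaves out.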

I would definitely use something simple first, and only plug this kind of thing in if/when it is too slow.

Upvotes: 2

Jack

Reputation: 133639

The Bag interface of Apache Commons Collections (homepage) should fulfill your requirements. It has many implementations, so there may also be one that keeps track of insertion order (your first point).

And it has:

  • removeAll
  • remove(Object, nCopies)

It is also quite fast compared to a plain LinkedList or ArrayList, but I'm not sure about getting the indices of inserted elements.

It is described as

Bag interface for collections that have a number of copies of each object
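
For reference, a minimal usage sketch (this assumes the newer commons-collections4 package names and the HashBag implementation; the older 3.x releases use org.apache.commons.collections instead, and HashBag itself does not keep insertion order):

    import org.apache.commons.collections4.Bag;
    import org.apache.commons.collections4.bag.HashBag;

    public class BagDemo {
        public static void main(String[] args) {
            Bag<String> bag = new HashBag<>();  // note: HashBag does not keep insertion order
            bag.add("a");
            bag.add("b", 3);                    // add three copies at once

            System.out.println(bag.getCount("b")); // 3
            bag.remove("b", 2);                    // drop two copies
            System.out.println(bag.getCount("b")); // 1

            bag.remove("a");                       // removes every remaining copy of "a"
            System.out.println(bag.contains("a")); // false
        }
    }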

Upvotes: 1
