Reputation: 42642
Previously, while I am looking for the data structure with the following characteristic.
In Java, I use LinkHashSet, In Python, I use OrderedDict
Now, on the top of 2 requirements, I would like to have an additional requirement
data[123]
Is there any data structure available? Or I need to fall back to use List
? List
able to full fill 2nd and 3rd requirement, but not 1st. I may need to perform manual (and slow check) during insertion, to avoid duplication?
Upvotes: 0
Views: 5740
Reputation: 40851
If you can use an immutable collection, use an ImmutableSet from Guava, which has an asList() view to provide indexed access.
Upvotes: 1
Reputation: 718678
A simple approach in Java would be to create a wrapper class that implements both the Set
and List
interfaces, and that contained both a HashSet
and ArrayList
. Update operations would need to update both internal collections, and read operations would be mapped to whichever internal collection gives the correct semantics and best performance. The only slightly tricky method is iterator()
where you need to arrange that remove
updates both collections.
This approach will give you the "best of both worlds" performance for read operations, but updates will necessarily be slower. In particular, insert at a given position and remove will be O(N)
operations.
(I'd note that LinkedHashSet is not a direct solution since it does not provide a get(int)
method. You could implement this method via the LinkedHashSet iterator, by that makes it an O(N)
operation. Probably not what you want.)
FOLLOW UP
I've not been able to find a general purpose implementation class that implements both the Set
and List
interfaces. I think that the reason is that there are semantic anomalies when you combine the interfaces. For example, (as @ColinD notes) if you call E set(int, E)
with an element that it already in the list, it is not clear what the result should be. Dealing with this in a way that satisfies everyone is probably impossible, and I can understand why they might have decided not to go swimming in the tarpit.
However, I don't see this as a major problem if you are creating a Set
+ List
class for your application's internal use. You either
(For instance, you might code it to either ignore result of the set
method, to throw an unchecked exception if there is a duplicate, or return null
or some distinguished object if there is a duplicate.)
For the record, it is not unforgivable for a custom collection class to violate the interface contract. Indeed, even the Java designers do it - see IdentityHashMap. What is unforgivable is not documenting the contract violations in the javadocs.
Upvotes: 3
Reputation: 57464
You're not going to find a basic data structure that does this; the goals you're looking for rule out all of them. You might find an more esoteric one that'll do it, but the easiest approach is to use a compound data structure, maintaining two data structures in parallel.
That's what collections.OrderedDict
does under the hood, in fact. That's not what you want, though: since it's not designed to support indexing, it uses a linked list under the hood for preserving order. Linked lists can't do indexing--short of a slow, linear scan, which you usually want to avoid since it tends to turn O(n^2) on you if used in a loop.
Here's a simple implementation. It maintains two data structures: a list, preserving the order of items as they're set, and a dict, for quick lookup by key. Both hold the value, and both hold the other's key: the dict holds the index within the list, and the list holds the key within the dict. This makes it easy to reference each data structure from the other, so it can handle both assignment and iteration efficiently.
Note that this doesn't implement every operation, just the basic ones: dict-style assignment a['x'] = 1
, dict-style lookup a['x']
, list-style assignment a.set_value_by_index(0, 1)
and list-style lookup a.get_value_by_index(0)
.
Also note carefully: this doesn't use the same syntax for dict-style and list-style operations. That's confusing and evil, and will bite you badly sooner or later. This doesn't turn a[0]
into a list-style lookup; if that's what you want, be explicit and use get_value_by_index
. Don't be magic and try to guess based on the parameter type.
Finally, it provides simple dict-style iteration, yielding the keys as a dict does. Implementing things like iteritems
and itervalues
or Python3 views are obvious extensions.
class IndexableUniqueList(object):
"""
>>> a = IndexableUniqueList()
>>> a['x'] = 1
>>> a['x']
1
>>> a['y'] = 2
>>> a['y']
2
>>> a.get_key_by_index(0)
'x'
>>> a.get_value_by_index(0)
1
>>> a.get_key_by_index(1)
'y'
>>> a.get_value_by_index(1)
2
>>> a['x'] = 3
>>> a.get_key_by_index(0)
'x'
>>> a.get_value_by_index(0)
3
>>> a.set_value_by_index(0, 4)
>>> a['x']
4
>>> [val for val in a]
['x', 'y']
"""
def __init__(self):
self.items_by_index = []
self.items_by_key = {}
def __getitem__(self, key):
return self.items_by_key[key][1]
def __setitem__(self, key, value):
if key in self.items_by_key:
idx, old_value = self.items_by_key[key]
self.items_by_key[key] = (idx, value)
self.items_by_index[idx] = (key, value)
return
idx = len(self.items_by_index)
self.items_by_key[key] = (idx, value)
self.items_by_index.append((key, value))
def get_key_by_index(self, idx):
return self.items_by_index[idx][0]
def get_value_by_index(self, idx):
key = self.get_key_by_index(idx)
return self.items_by_key[key][1]
def set_value_by_index(self, idx, value):
key = self.items_by_index[idx][0]
self[key] = value
def __iter__(self):
for key, value in self.items_by_index:
yield key
Upvotes: 0
Reputation: 14659
The LinkedHashSet class provides the toArray-Method, which should fit your needs.
Upvotes: 0
Reputation: 4290
java.util.Set
doesn't provide random access methods like get() and set(), so most/all of its implementations don't either. You could create your own implementation of Set
that will provide this, maybe with an ArrayList to hold the data.
Upvotes: 0