Reputation: 19895
Here is a tricky data structure and data organization case.
I have an application that reads data from large files and produces objects of various types (e.g., Boolean
, Integer
, String
) that are categorized in a few (less than a dozen) groups and then stored in a database.
Each object is currently stored in a single HashMap<String, Object>
data structure. Each such HashMap
corresponds to a single category (group). Each database record is built from the information in all the objects contained in all categories (HashMap data structures).
A requirement has appeared for checking whether subsequent records are "equivalent" in the number and type of columns, where equivalence must be verified across all maps by comparing the name (HashMap
key) and the type (actual class) of each stored object.
I am looking for an efficient way of implementing this functionality, while maintaining the original object categorization, because listing objects by category in the fastest possible way is also a requirement.
An idea would be to just sort the keys (e.g., by replacing each HashMap
with a TreeMap
) and then walk over all maps. An alternative would be to just copy everything in a TreeMap
for comparison purposes only.
What would be the most efficient way of implementing this functionality?
Also, if how would you go about finding the difference (i.e., the fields added and those removed), between successive records?
Upvotes: 1
Views: 543
Reputation: 3096
Create a meta SortedSet in which you store all the created maps.
Means SortedSet<Map<String,Object>>
e.g. a TreeSet
which as a custom Comparator<Map<String,Object>>
which does check exactly your requirements of same number and names of keys and same object type per value.
You can then use the contains() method of this meta set structure to find out if a similar record does already exist.
==== EDIT ====
Since I've misundertood the relation between database records and the maps in the first place, I've to change some semantics my answer now of course a little bit.
Still I'would use the mentioned SortedSet<Map<String,Object>>
but of course the Map<String,Object>
would now point to that Map you and havexy suggested.
On the other hand could it be a step forward to use a Set<Set<KeyAndType>>
or SortedSet<Set<KeyAndType>>
where your KeyAndType will only contain the key and the type with appropriate Comparable
implementation or equals with hashcode
.
Why? You asked how to find the differences between two records? If each record relates to one of those inner Set<KeyAndType>
you can easily use retainAll()
to form the intersection of two successive Sets.
If you would compare this to the idea of a SortedSet<Map<String,Object>>
, in both ways you would have the logic which differenciates between the fields within the comparator, one time comparing inner sets, one time comparing inner maps. And since this information gets lost when the surrounding set is constructed, it will be hard to get the differences between two records later on, if you do not have another reduced structure which is easy to use to find such differences. And since such a Set<KeyAndType>
could act as key as well as as easy base for comparison between two records, it could be a good candidate to be used for both purposes.
If furthermore you wanna keep the relation between such a Set<KeyAndType>
to your record or the group of Map<String,Object>
your meta structure could be something like:
Map<Set<KeyAndType>,DatabaseRecord>
or Map<Set<KeyAndType>,GroupOfMaps>
implemented by a simple LinkedHashMap
which allows simple iteration in original order.
Upvotes: 2
Reputation: 9590
One soln is to keep both category based HashMap
and combined TreeMap
. This will have slight more memory requirement, not much though, as you ll just keep the same reference in both of them.
So whenever you are adding/removing to HashMap
you will do the same operation in the TreeMap
too. This way both will always be in sync.
You can then use TreeMap for comparison, whether you want comparison of type of object or actual content comparison.
Upvotes: 2