Reputation: 58796
I wish to store many records in a clojure hash table. If I wish to get fast access to certain records using a certain field or range query then what options do I have, without having to resort to storing the data in a database (which is where the data came from in the first place).
I guess I'm also wondering whether an STM is the right place for a large indexed data set as well.
Upvotes: 3
Views: 490
Reputation: 70241
Depending how far you want to push this, you're asking to build an in-memory database. I assume you don't actually want to do that or presumably use one of the many in-memory Java databases that already exist (Derby, H2, etc).
If you want indexed or range access to multiple attributes of your data, then you need to create all of those indexes in Clojure data structures. Clojure maps will give you O(log32 n) time access to data (worse than constant, but still very bounded). If you need better than that, you can use Java maps like HashMap or ConcurrentHashMap directly with the caveat that you're outside the Clojure data model. For range access, you'll want some sort of sorted tree data structure... Java has ConcurentSkipListMap which is pretty great for what it does. If that's not good enough, you might need your own btree impl.
If you're not changing this data, then Clojure's STM is immaterial. Is this data treated as a cache of a subset of the database? If so, you might consider using a cache library like Ehcache instead (they've recently added support for very large off-heap caches and search capabilities).
Balancing data between in-memory cache and persistent store is a tricky business and one of the most important things to get right in data-heavy apps.
Upvotes: 5
Reputation: 106391
You'll probably want to create separate indexes for each field using a sorted-map so that you can do range queries. Under the hood this uses something like a persistent version of a Java TreeMap.
STM shouldn't be an issue if you are mostly interested in read access. In fact it might even prove better than mutable tables since:
Upvotes: 2