sanity
sanity

Reputation: 35772

In-memory data structure that supports boolean querying

I need to store data in memory where I map one or more key strings to an object, as follows:

"green", "blue" -> object1
"red", "yellow" -> object2

So, in Java the datastructure might implement:

Map<Set<String>, V>

I need to be able to efficiently receive a list of objects, where the strings match some boolean criteria, such as:

("red" OR "green") AND NOT "blue"

I'm working in Java, so the ideal solution would be an off-the-shelf Java library. I am, however, willing to implement something from scratch if necessary.

Anyone have any ideas? I'd rather avoid the overhead of an in-memory database if possible, I'm hoping for something comparable in speed to a HashMap (or at least the same order of magnitude).

Upvotes: 4

Views: 1237

Answers (9)

sanity
sanity

Reputation: 35772

I wasn't able to find a satisfactory solution, so I decided to cook up my own and release it as an open source (LGPL) project, find it here.

Upvotes: 0

Carl
Carl

Reputation: 7554

The Google Collections SetMultimap looks like an easy way to get the underlying structure, then combining that with the Maps static filters to get the querying behavior you want.

Construction would go something like

smmInstance.put(from1,to1);
smmInstance.put(from1,to2);
smmInstance.put(from2,to3);
smmInstance.put(from3,to1);
smmInstance.put(from1,to3);
//...

queries would then look like

valueFilter = //...build predicate
Set<FromType> result = Maps.filterValues(smmInstance.asMap(),valueFilter).keySet()

You can do any amount of fancy building the predicate, but Predicates has a few methods that would probably be enough to do contains/doesn't contain style queries.

Upvotes: 0

Persimmonium
Persimmonium

Reputation: 15791

this would have worked too reusable condition/expression classes

Upvotes: 0

Sean Patrick Floyd
Sean Patrick Floyd

Reputation: 298938

Actually, I liked the problem so I implemented a full solution in the spirit of my earlier answer:

http://pastebin.com/6iazSKG9

A simple solution, not thread safe or anything, but fun and a good starting point, I guess.

Edit: Some elaboration, as requested


See the unit test for usage.

There are two interfaces, DataStructure<K,V> and Query<V>. DataStructure behaves somewhat like a map (and in my implementation it actually works with an internal map), but it also provides reuseable and immutable query objects which can be combined like this:

    Query<String> combinedQuery = 
    structure.and(
                    structure.or(
                            structure.search("blue"), 
                            structure.search("red")
                    ),
                    structure.not(
                            structure.search("green")
                    )
    );

(A Query that searches for objects that are tagged as (blue OR red) AND NOT green). This query is reuseable, which means that it's results will change whenever the backing map is changed (kind of like an ITunes smart playlist).

The query objects are already thread safe, but the backing map is not, so there is some room for improvement here. Also, the queries could cache their results, but that would probably mean that the interface would have to be extended to provide for a purge method (kind of like the detach method in Wicket's models), which wouldn't be pretty.

As for licensing: if anybody wants this code I'll be happy to put it on SourceForge etc. ...

Sean

Upvotes: 6

leora
leora

Reputation: 196539

i truly think some type of database solution is your best bet. SQL easily supports querying data by

(X and Y) and not Z

Upvotes: 0

Puppy
Puppy

Reputation: 146940

You could map string keys to a binary constant, then use bit shifting to produce an appropriate mask.

Upvotes: 0

Timothy
Timothy

Reputation: 2477

Check out the Apache Commons - Collections project. They have tons of great stuff that you will be able to use, particularly the CollectionUtils class for performing strong collection-based logic.

For instance, if your values were stored in a HashMap (as suggested by another answer) as follows:

myMap["green"] -> obj1
myMap["blue"] -> obj1
myMap["red"] -> obj2
myMap["yellow"] -> obj2

Then to retrieve results that match: ("red" or "green") and not "blue you might do this:

CollectionUtils.disjunction(CollectionUtils.union(myMap.get("red"), myMap.get("green")), myMap.get("blue"))

Upvotes: 0

pdbartlett
pdbartlett

Reputation: 1519

Would the criteria be amenable to bitmap indexing: http://en.wikipedia.org/wiki/Bitmap_index ?

Upvotes: 1

aioobe
aioobe

Reputation: 421040

I would say that the easiest way is simply to do a recursive filtering and being cleaver, when for instance evaluating X AND Y where X has been evaluated to the empty set.

The mapping however, needs to be from tags (such as "red" or "blue") to sets of objects.

The base case (resolving the atomic tags) of the recursion, would then be a simple lookup in this map. AND would be implemented using intersection, OR using union, and so on.

Upvotes: 0

Related Questions