multi_index_container

Question

So I have a boost::multi_index_container with multiple non-unique indexes. I would like to find an elegant way to do an relational-database style query to find all elements that match a set of criteria using multiple indexes.

For instance, given a list of connections between devices, I'd like to search for all elements whose source is 'server' and whose destination is 'pc2'. I've got an index on Source and an index on Dest.

Source     Dest/Port
----       ---------
server     pc1  23
server     pc1  27
server     pc1  80
server     pc2  80    <- want to find these two
server     pc2  90    <-
printer    pc3  110
printer    pc1  110
scanner    mac  8080

Normally I might do lower_bound and upper_bound on the first index (to match 'server'), then do a linear search between those iterators to find those elements that match in the "Dest" column, but that's not very satisfying, since I've got a second index. Is there an elegant stl/boost-like way to take advantage of the fact that there are two indexes and avoid a linear search (or an equivalent amount of work, such as adding all intermediate results to another container, etc.)?

(Obviously in the example, a linear search would be fastest, but if there were 10000 items with 'server' as the source, having the second index would start to be nice.)

Any ideas are appreciated!

Matthieu M. · Accepted Answer

You might simply get some inspiration from relational databases...

... but first we need to demystify a thing about indexes.

Compound Indexes

In a relational database there are two types of indexes:

regular indexes: an index on one column
compound indexes: an index on multiple columns at once

The two give different performance results. When you need to use two indexes, there is a merge pass to combine the results given by them (also called join), therefore compound indexes can provide a speed boost.

Multi-Index

Boost multi-index can use compounds indexes, you are free to provide your own hashing or comparison function after all.

A key difference with a relational database is that you cannot have an efficient merge pass (merging two ROWID sets) because this require intrinsic knowledge to be efficient, therefore you are indeed stuck with a linear search among the results of the first search. It is up to you to find the most discriminant first search.

Note: the name multi-index refers to the idea that it automatically maintains multiple index when you insert, update and delete your elements. It also means that you can search using any of those indexes with a performance profile that you decided. But it is not a full-blown database engine with statistics, heuristics and a query engine.

multi_index_container

Answers (2)

Related Questions