emilly

Reputation: 10530

Cassandra vs MongoDB in respect of Secondary Index?

In a blog post I came across the following statement:

Secondary indexes are a first-class construct in MongoDB. This makes it easy to index any property of an object stored in MongoDB even if it is nested. This makes it really easy to query based on these secondary indexes. Cassandra has only cursory support for secondary indexes. Secondary indexes are also limited to single columns and equality comparisons.

I have two related questions about it.

Cassandra has only cursory support for secondary indexes.

I am not sure what gives MongoDB better support than Cassandra, which is said to have only cursory support.

Secondary indexes[Cassandra] are also limited to single columns

My understanding of the above statement is that Cassandra supports a secondary index only on a single column, not on a composite column.

Secondary indexes[Cassandra] are also limited to equality comparisons. I believe this means MongoDB can also use an index for greater-than/less-than operations, whereas Cassandra is limited to equality?

My understanding is that MongoDB implements secondary indexes in a similar fashion to an RDBMS, but I am not sure how Cassandra implements them at a high level.

Upvotes: 1

Views: 1911

Answers (1)

dilsingi

Reputation: 2958

The major difference between Mongo and Cassandra is that Mongo is a master/slave DB, whereas Cassandra is a masterless system.

In Mongo, all writes go to a single PRIMARY node and are then replicated to the SECONDARIES. So the entire data set is available on a single node, and hence maintaining a secondary index and querying by it becomes simpler (just like in any RDBMS). That is why it supports all types of queries against a secondary index. It gets more complicated as we move from a simple replica set to a sharded Mongo system.

In Cassandra, the data is distributed across multiple nodes based on the partition key. A secondary index is built only over the data on that particular node and is unaware of the data on other nodes. So querying by a column in a secondary index results in a scatter-gather, because the matching data could lie on any node. As no partition key is involved to restrict the number of nodes, this type of query would be very slow (depending on the cluster size).
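To make the scatter-gather point concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions: the `Node` and `Cluster` classes, the CRC32-based placement, and the `city` column are all made up for illustration, and this is not how Cassandra is actually implemented. It only shows that a lookup by partition key touches one node, while a lookup through per-node local indexes has to ask every node.

```python
import zlib
from collections import defaultdict


class Node:
    """Hypothetical node: holds its own rows plus a local index on one column."""

    def __init__(self):
        self.rows = {}                        # partition key -> row
        self.local_index = defaultdict(set)   # indexed value -> keys stored on THIS node


class Cluster:
    """Toy cluster that places rows by hashing the partition key."""

    def __init__(self, node_count, indexed_column):
        self.nodes = [Node() for _ in range(node_count)]
        self.indexed_column = indexed_column

    def _owner(self, key):
        # Assumed placement rule: deterministic hash of the key modulo node count.
        return self.nodes[zlib.crc32(str(key).encode()) % len(self.nodes)]

    def put(self, key, row):
        node = self._owner(key)
        node.rows[key] = row
        node.local_index[row[self.indexed_column]].add(key)

    def get_by_key(self, key):
        # The partition key tells us exactly which node to ask.
        return self._owner(key).rows.get(key), 1

    def find_by_index(self, value):
        # No partition key: every node must consult its own local index.
        matches, contacted = [], 0
        for node in self.nodes:
            contacted += 1
            for key in node.local_index.get(value, ()):
                matches.append(node.rows[key])
        return matches, contacted


cluster = Cluster(node_count=4, indexed_column="city")
cluster.put(1, {"id": 1, "city": "Oslo"})
cluster.put(2, {"id": 2, "city": "Lima"})
cluster.put(3, {"id": 3, "city": "Oslo"})

row, nodes_for_key_lookup = cluster.get_by_key(2)
oslo_rows, nodes_for_index_lookup = cluster.find_by_index("Oslo")
print(nodes_for_key_lookup, nodes_for_index_lookup)
```

In this toy setup the number of nodes contacted by `find_by_index` grows with the cluster size, which is the cost described above.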

So a RANGE query in Mongo is no big deal, because the entire data set is available on the PRIMARY node. In Cassandra, a range scan over a secondary index would mean scanning every row on that given node. Such a range query could take forever, and hence it is not supported.
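The equality-versus-range difference can also be sketched in a few lines of Python. This is an illustrative assumption, not either database's real storage format: a plain dictionary stands in for an index that can only be probed by exact value, and a sorted list searched with `bisect` stands in for an ordered index of the kind an RDBMS typically uses. The sample `people` data and function names are made up.

```python
import bisect

people = [("alice", 31), ("bob", 25), ("carol", 42), ("dave", 25), ("erin", 37)]

# Exact-value index: good for equality, no ordering between entries.
by_age = {}
for name, age in people:
    by_age.setdefault(age, []).append(name)


def equality_lookup(age):
    # A single probe finds all matches.
    return sorted(by_age.get(age, []))


def range_via_exact_index(lo, hi):
    # Without ordering, every entry has to be examined.
    examined, out = 0, []
    for age, names in by_age.items():
        examined += 1
        if lo <= age <= hi:
            out.extend(names)
    return sorted(out), examined


# Ordered index: entries sorted by the indexed value.
ordered = sorted((age, name) for name, age in people)
ages = [age for age, _ in ordered]


def range_via_ordered_index(lo, hi):
    # Binary search finds both ends; only the matching slice is read.
    start = bisect.bisect_left(ages, lo)
    end = bisect.bisect_right(ages, hi)
    return sorted(name for _, name in ordered[start:end]), end - start


print(equality_lookup(25))
print(range_via_exact_index(30, 40))
print(range_via_ordered_index(30, 40))
```

Both range functions return the same names, but the exact-value index has to look at every entry while the ordered index reads only the matching slice.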

Upvotes: 5
