codersofthedark

Reputation: 9645

Cassandra or Solr? Which gives better performance for front-end read queries?

My team has asked me to choose between Cassandra and Solr for faster responses to front-end queries. I told them that Cassandra is a NoSQL database while Solr is an indexing engine. But they say we can push our complete database into Solr (that is, use Solr as the database), or we can use Cassandra together with Solr. I'm confused.

The data we are dealing with is roughly 1 billion records spread over 4 MySQL tables (fetched using joins), and we get only read queries from the website. We don't need full-text search.

I think the one thing in which Solr cannot easily be beaten is its full-text search feature, but we don't need that in our case.

So what else does Solr have that Cassandra cannot provide, and what does Cassandra have that would let it replace Solr in our particular case?

In other words, which is going to perform better: Cassandra alone, Solr alone as a database, or both together? And most importantly, why or why not?

It's really important for me to back up my choice at my next team meeting with strong points about why one is better than the other.

And thanks in advance.

Upvotes: 16

Views: 18928

Answers (4)

Alexis Hope

Reputation: 21

Solr's indexing features would outperform Cassandra for reads. Its caches keep the results of popular queries, so frequent ones will be faster still. It was built for reads; Cassandra is built to store. But as already stated, Cassandra will scale awesomely if that's needed. Why not benchmark a single node: load 1 million random text strings and average over 1 million queries? Either of them will outperform MySQL, let alone MySQL join queries. PS: Solr will soon support joins, I think in Solr 4.
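The benchmark suggested above could be sketched roughly as follows. This is a minimal harness, not a tuned load test: `random_strings` and `benchmark_reads` are hypothetical helper names, and the in-memory dict is only a stand-in for a real backend. To compare stores, `lookup` would be replaced with a function that issues one real read against Cassandra, Solr or MySQL.

```python
import random
import statistics
import string
import time


def random_strings(count, length=16, seed=42):
    """Generate `count` pseudo-random lowercase strings (deterministic per seed)."""
    rng = random.Random(seed)
    return ["".join(rng.choices(string.ascii_lowercase, k=length)) for _ in range(count)]


def benchmark_reads(lookup, keys, queries=10_000, seed=7):
    """Call lookup(key) for randomly chosen keys; return latency stats in ms."""
    rng = random.Random(seed)
    latencies = []
    for _ in range(queries):
        key = rng.choice(keys)
        start = time.perf_counter()
        lookup(key)  # one read against the backend under test
        latencies.append((time.perf_counter() - start) * 1000.0)
    latencies.sort()
    return {
        "queries": queries,
        "mean_ms": statistics.fmean(latencies),
        "p95_ms": latencies[int(0.95 * (len(latencies) - 1))],
        "max_ms": latencies[-1],
    }


# Stand-in backend (assumption): an in-memory dict instead of a real data store.
keys = random_strings(1_000)
store = {k: k.upper() for k in keys}
stats = benchmark_reads(store.get, keys, queries=5_000)
print(stats)
```

Reporting a high percentile alongside the mean is a deliberate choice here, since front-end users tend to notice the slow tail more than the average.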

Upvotes: 2

user349026

Reputation:

  • Cassandra is a NoSQL data store designed to handle huge amounts of data: terabytes and beyond. It was definitely designed to perform.
  • Remember that NoSQL databases and data stores have limited query capabilities. They do not support JOIN queries, because joins would kill the system. Think about it!
  • You would definitely be able to read and write pretty fast, and some of the data can be queried.
  • Flexible schema: you can push sparse data into it. Where in a conventional database you would store NULL for an empty entry, here you don't store it at all :) You don't need to!
  • No full-text searching.
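Since there are no JOINs, data that is currently fetched with joins would typically be pre-joined at write time and stored as one record per lookup key. A minimal sketch of that idea, using made-up `users` and `orders` tables (hypothetical names and data, not the asker's schema) and a plain dict in place of the data store:

```python
from collections import defaultdict

# Hypothetical normalized tables, as they might sit in MySQL.
users = [{"user_id": 1, "name": "alice"}, {"user_id": 2, "name": "bob"}]
orders = [
    {"order_id": 10, "user_id": 1, "item": "book"},
    {"order_id": 11, "user_id": 1, "item": "pen"},
    {"order_id": 12, "user_id": 2, "item": "lamp"},
]


def denormalize(users, orders):
    """Pre-join users with their orders into one record per user_id."""
    orders_by_user = defaultdict(list)
    for order in orders:
        orders_by_user[order["user_id"]].append(
            {"order_id": order["order_id"], "item": order["item"]}
        )
    return {
        user["user_id"]: {"name": user["name"], "orders": orders_by_user[user["user_id"]]}
        for user in users
    }


# The read path is now a single key lookup with no join.
wide_rows = denormalize(users, orders)
print(wide_rows[1])
```

The trade-off is that the join cost moves to write time and the data is duplicated, which suits a workload that is almost entirely reads.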

This is where the big BUT comes in.

  • Having said the above, Solr on the other hand is a TF-IDF full-text search engine, though you can use it as your database.
  • Flexible schema. Just mark fields as not required.
  • Solr will help in tokenizing, parsing and indexing the data pretty quickly, and its response times are superb. It returns XML, which you can parse into whatever representation you need.
  • Read queries are fast, and I mean really fast. But I have no Cassandra-versus-Solr comparison to share.

And in the end, since you want Cassandra and Solr together, check out Solandra (formerly Lucandra).

Upvotes: 5

Tyler Hobbs

Reputation: 6932

If you don't need Solr's full-text search capabilities, there's very little reason to choose it over Cassandra, in my opinion.

(Disclosure: I work for DataStax.)

Operationally, handling a Cassandra cluster will be much simpler due to the Dynamo-based architecture. Sharding Solr can be quite painful, which is one of the big reasons why we at DataStax built search into DSE; it's something that a lot of people want to avoid. I'm not trying to sell you on DSE, just pointing out the downside to Solr.

For example, when you want to change the number of shards with Solr, you have to create and build an entirely new index. You have to worry about deadlock with a Solr cluster. There are several other limitations: http://wiki.apache.org/solr/DistributedSearch
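One way to see why changing the shard count is so disruptive is the sketch below. It assumes documents are routed to shards by hashing the document id modulo the number of shards, which is a common manual scheme but an assumption here, since the routing scheme is not stated above. The function names are hypothetical.

```python
import hashlib


def shard_for(doc_id, num_shards):
    """Assign a document to a shard by hashing its id modulo the shard count."""
    digest = hashlib.md5(doc_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def fraction_moved(doc_ids, old_shards, new_shards):
    """Fraction of documents whose shard changes when the shard count changes."""
    moved = sum(
        1 for d in doc_ids if shard_for(d, old_shards) != shard_for(d, new_shards)
    )
    return moved / len(doc_ids)


doc_ids = [f"doc-{i}" for i in range(10_000)]
moved = fraction_moved(doc_ids, old_shards=4, new_shards=5)
print(f"{moved:.0%} of documents change shard when going from 4 to 5 shards")
```

Under this scheme a document stays put only when its hash gives the same remainder for both shard counts, so going from 4 to 5 shards relocates about 80% of documents in expectation. At that point rebuilding the index from scratch is usually the practical option.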

You haven't said much about what kind of queries you need to be able to support. Adding that info would get you better answers.

Upvotes: 9

Marko Bonaci

Reputation: 5708

You can also take a look at DataStax.
There are Community and Enterprise editions, though I think Solr isn't included in the Community edition :(

Solandra is no longer being actively developed; the author moved to DataStax and continued his work there.

IMHO, DataStax is to Cassandra what Cloudera is to Hadoop.

Upvotes: 4
