Reputation:
I'm trying to set up good natural-language search on a website, and I want to understand the advantages of Apache Solr versus Xapian. Xapian seems easier to set up. Do both offer good natural-language search? Any insight is appreciated.
Upvotes: 9
Views: 3761
Reputation: 1221
Xapian is more like Lucene, a library that you integrate with your application. If you have a C++ app, then Xapian might be a better match. If you have a Java application, Lucene is almost certainly the best choice.
If you want a search server, then compare Omega (built on Xapian) to Solr (built on Lucene). I have not used Omega or Xapian, but Solr has a few features that I have come to depend on, especially the per-field analysis chains. That is a brilliant idea, and one that I wish I had thought of when I was working on Ultraseek.
It is quite easy to extend the Solr analysis chain with your own Java class. I expect that would be more difficult in C++ with Omega/Xapian.
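To make the idea of per-field analysis chains concrete, here is a toy sketch in plain Python. It is not Solr's API or configuration format; the field names (`title`, `sku`), the `CHAINS` table, and the `strip_plural_s` filter are all made up for illustration. The point is only that each field gets its own ordered list of token filters, and that adding a custom step is just a matter of writing one more filter and dropping it into a chain.

```python
import re

def tokenize(text):
    """Split text into runs of letters and digits."""
    return re.findall(r"[A-Za-z0-9]+", text)

def lowercase(tokens):
    return [t.lower() for t in tokens]

STOPWORDS = {"a", "an", "the", "of", "and"}

def remove_stopwords(tokens):
    return [t for t in tokens if t not in STOPWORDS]

def strip_plural_s(tokens):
    """Hypothetical custom filter: crude plural stripping, not a real stemmer."""
    return [
        t[:-1] if len(t) > 3 and t.endswith("s") and not t.endswith("ss") else t
        for t in tokens
    ]

# One filter chain per field (assumed field names).
CHAINS = {
    "title": [lowercase, remove_stopwords, strip_plural_s],  # free text
    "sku": [],  # product codes are kept exactly as tokenized
}

def analyze(field, text):
    """Tokenize, then run the tokens through that field's filters in order."""
    tokens = tokenize(text)
    for token_filter in CHAINS[field]:
        tokens = token_filter(tokens)
    return tokens

print(analyze("title", "The Engines of Search"))
print(analyze("sku", "XJ9 The"))
```

The same input word can therefore be indexed differently depending on the field: "The" is dropped from a title but kept in a product code. In Solr the equivalent wiring is declared per field type in the schema rather than in application code.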
The two engines use different underlying relevance models: Xapian is a probabilistic engine, while Lucene is a vector space engine. I have seen both models tuned to perform well, so the relevance model alone may not be a deciding factor.
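For a rough feel of the two model families, the sketch below scores a tiny made-up corpus two ways: with BM25, a probabilistic formula of the kind Xapian uses by default, and with TF-IDF cosine similarity, the classic vector space approach. Neither function is the actual Xapian or Lucene implementation. The corpus, the query, the parameter values `k1=1.2` and `b=0.75`, and the particular IDF variants are assumptions chosen to keep the example short.

```python
import math
from collections import Counter

# Tiny illustrative corpus, already tokenized.
DOCS = [
    "xapian is a search library written in c++".split(),
    "solr is a search server built on lucene".split(),
    "lucene is a java search library".split(),
]

N = len(DOCS)
DF = Counter(term for doc in DOCS for term in set(doc))  # document frequency
AVG_LEN = sum(len(doc) for doc in DOCS) / N

def bm25(query, doc, k1=1.2, b=0.75):
    """Probabilistic scoring: BM25 with a non-negative IDF variant."""
    tf = Counter(doc)
    score = 0.0
    for term in query:
        if term not in tf:
            continue
        idf = math.log(1 + (N - DF[term] + 0.5) / (DF[term] + 0.5))
        length_norm = k1 * (1 - b + b * len(doc) / AVG_LEN)
        score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
    return score

def tfidf_vector(tokens):
    """Sparse TF-IDF vector; terms unseen in the corpus are ignored."""
    tf = Counter(tokens)
    return {t: c * math.log(1 + N / DF[t]) for t, c in tf.items() if t in DF}

def cosine(query, doc):
    """Vector space scoring: cosine of the angle between TF-IDF vectors."""
    q, d = tfidf_vector(query), tfidf_vector(doc)
    dot = sum(w * d.get(t, 0.0) for t, w in q.items())
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    norm_d = math.sqrt(sum(w * w for w in d.values()))
    return dot / (norm_q * norm_d) if norm_q and norm_d else 0.0

def rank(score_fn, query):
    """Return document indices ordered from best to worst match."""
    return sorted(range(N), key=lambda i: score_fn(query, DOCS[i]), reverse=True)

query = "java search library".split()
print("BM25 ranking:  ", rank(bm25, query))
print("Cosine ranking:", rank(cosine, query))
```

On this toy data both functions put the documents in the same order, which fits the observation that either model can be made to work well. On real collections the differences show up mostly in how term frequency saturates and how document length is normalized, and both are tunable.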
The Solr/Lucene community is large and very helpful.
Upvotes: 7