user2628641

Reputation: 2154

Architecture for handling search queries from multiple users at the same time

I am working on a problem similar to shopping online at Amazon.

There are many products, and their information is stored in a database. Users can enter a search term, and a list of the closest-matching products is returned.

Currently I am using Lucene to handle the searching process. It's a very simple architecture:

  1. When a user enters a search term, Lucene goes through the whole database and indexes every product
  2. Lucene then returns a list of the best-matching products

The problem with the prototype is that when many users are querying, Lucene has to loop through the entire database and index it for each of them. And if a user queries, logs out, and then queries again, Lucene has to loop through the database again. This is pretty slow.

So what are some ways to improve this (or better technology choices)?

Upvotes: 0

Views: 304

Answers (2)

D C Sahu

Reputation: 49

You do not have to build the index on every search. You can build it (replacing the previous one) when your server starts. Once that is done, searches run against the index instead of the database, which is much faster.
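To illustrate the build-once, search-many idea, here is a minimal sketch in plain Java. It does not use Lucene: `ProductIndex` is a hypothetical stand-in for a Lucene index, and the product names are made up. The point is only that the product data is scanned once in the constructor, and every later query is answered from the in-memory index.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical stand-in for a Lucene index: built once, then queried many times.
class ProductIndex {
    // term -> ids of the products whose name contains that term
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final Map<Integer, String> names = new HashMap<>();

    // Index every product exactly once, e.g. at server startup.
    ProductIndex(Map<Integer, String> products) {
        for (Map.Entry<Integer, String> e : products.entrySet()) {
            names.put(e.getKey(), e.getValue());
            for (String term : tokenize(e.getValue())) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(e.getKey());
            }
        }
    }

    // Lowercase the text and split it on non-word characters.
    static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase().split("\\W+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Answer a query from the index only; the product data is not rescanned.
    List<String> search(String query) {
        Map<Integer, Integer> hits = new TreeMap<>();
        for (String term : tokenize(query)) {
            Set<Integer> ids = postings.getOrDefault(term, Collections.emptySet());
            for (int id : ids) {
                hits.merge(id, 1, Integer::sum);
            }
        }
        List<Integer> ranked = new ArrayList<>(hits.keySet());
        // Products matching more query terms come first; ties are broken by id.
        ranked.sort((a, b) -> {
            int byScore = Integer.compare(hits.get(b), hits.get(a));
            return byScore != 0 ? byScore : Integer.compare(a, b);
        });
        List<String> result = new ArrayList<>();
        for (int id : ranked) {
            result.add(names.get(id));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, String> products = new LinkedHashMap<>();
        products.put(1, "Red running shoes");
        products.put(2, "Blue running jacket");
        products.put(3, "Red wool scarf");
        ProductIndex index = new ProductIndex(products); // built once
        System.out.println(index.search("red shoes"));   // served from the index
        System.out.println(index.search("running"));     // no rebuild needed
    }
}
```

With Lucene, the constructor step corresponds to writing the index with an IndexWriter, and `search` corresponds to querying it with an IndexSearcher.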

Now, product prices or other product data may change, and products may be added or updated. In that case you can write the changes to your database, and rebuild your index when your server restarts.

I would prefer to update the existing index rather than create it from scratch on every server restart. For this you can keep a field like "last_updated_date" in your database as well as in your index. For every product, comparing the two values tells you whether the index entry is out of date. So on server restart you can make a list of the products that need to be updated and re-index only those.
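A rough sketch of that incremental refresh, again in plain Java without Lucene. The `Product` class, the map used as the index, and the dates are all hypothetical. In a real setup the filter would more likely be a database query on "last_updated_date", and the overwrite would be something like Lucene's `IndexWriter.updateDocument`.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class IncrementalReindex {
    // Hypothetical row from the product table, carrying its last_updated_date.
    static final class Product {
        final int id;
        final String name;
        final Instant lastUpdated;

        Product(int id, String name, Instant lastUpdated) {
            this.id = id;
            this.name = name;
            this.lastUpdated = lastUpdated;
        }
    }

    // Keep only the rows that changed after the index was last refreshed.
    static List<Product> changedSince(Collection<Product> rows, Instant lastIndexedAt) {
        List<Product> stale = new ArrayList<>();
        for (Product p : rows) {
            if (p.lastUpdated.isAfter(lastIndexedAt)) {
                stale.add(p);
            }
        }
        return stale;
    }

    // Overwrite the index entries of the changed rows and return how many were touched.
    // Entries for unchanged rows are left as they are.
    static int refresh(Map<Integer, String> index, Collection<Product> rows, Instant lastIndexedAt) {
        List<Product> stale = changedSince(rows, lastIndexedAt);
        for (Product p : stale) {
            index.put(p.id, p.name);
        }
        return stale.size();
    }

    public static void main(String[] args) {
        Instant lastIndexedAt = Instant.parse("2024-02-01T00:00:00Z");
        List<Product> rows = Arrays.asList(
                new Product(1, "Red running shoes", Instant.parse("2024-01-01T00:00:00Z")),
                new Product(2, "Blue running jacket (new price)", Instant.parse("2024-03-01T00:00:00Z")));

        // State of the index before the restart.
        Map<Integer, String> index = new HashMap<>();
        index.put(1, "Red running shoes");
        index.put(2, "Blue running jacket");

        int updated = refresh(index, rows, lastIndexedAt);
        System.out.println(updated + " product(s) re-indexed: " + index);
    }
}
```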

Upvotes: 1

Rob Conklin

Reputation: 9464

When you create your Lucene index (using an IndexWriter), you should use an FSDirectory object, which stores the index in files on disk. All users should then search this index through a single shared IndexSearcher.

IndexSearcher is thread safe (and relatively expensive to create), so you definitely want to keep it around after you use it.
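The sharing pattern looks roughly like the sketch below. It is plain Java, not Lucene code: `Searcher` is a hypothetical stand-in for IndexSearcher, and the product list is made up. It only shows one expensive object being created once and then used by several threads at the same time, which is safe here because the object never changes after construction.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class SharedSearcherDemo {
    // Counts how many times the (expensive) searcher was constructed.
    static final AtomicInteger created = new AtomicInteger();

    // Hypothetical stand-in for IndexSearcher: read-only after construction.
    static final class Searcher {
        private final List<String> products;

        Searcher(List<String> products) {
            created.incrementAndGet();
            this.products = Collections.unmodifiableList(new ArrayList<>(products));
        }

        // Case-insensitive substring match; reads shared state but never writes it.
        List<String> search(String term) {
            String needle = term.toLowerCase();
            List<String> out = new ArrayList<>();
            for (String p : products) {
                if (p.toLowerCase().contains(needle)) {
                    out.add(p);
                }
            }
            return out;
        }
    }

    // One searcher for the whole application, kept around between requests.
    static final Searcher SHARED = new Searcher(
            Arrays.asList("Red running shoes", "Blue running jacket", "Red wool scarf"));

    // Run each query on a worker thread; all workers use the same SHARED instance.
    static List<List<String>> searchConcurrently(List<String> queries) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<List<String>>> futures = new ArrayList<>();
            for (String q : queries) {
                Callable<List<String>> task = () -> SHARED.search(q);
                futures.add(pool.submit(task));
            }
            List<List<String>> results = new ArrayList<>();
            for (Future<List<String>> f : futures) {
                results.add(f.get()); // results come back in query order
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("search failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(searchConcurrently(Arrays.asList("red", "running", "scarf")));
        System.out.println("searchers created: " + created.get());
    }
}
```

Lucene also ships a SearcherManager class intended for sharing an IndexSearcher across threads and refreshing it when the index changes, which may be worth a look once the basic setup works.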

I think you are going to be very impressed with the performance of this once you keep these things around.

Please take a look at this tutorial, which looks fairly good: http://oak.cs.ucla.edu/cs144/projects/lucene/

Upvotes: 1
