Sam Delaney

Reputation: 335

Is there a limit to the amount of text I can throw at Lucene.NET?

I am looking into using Lucene.NET after reading some bad reviews of SQL Server's full text handling.

Should I be careful with how much data I give to Lucene.NET to index?

Also, to avoid extra database calls, what is the best practice for storing data in the index, such as the entry ID, title, etc.?

EDIT: This also explains how much data Lucene can handle.

Upvotes: 1

Views: 198

Answers (1)

Marcus

Reputation: 977

Search-driven web sites are not uncommon these days: the search index acts as a repository/document DB and serves data not only for searching but also for generating the navigation and/or facets. Lucene is well suited to this purpose, and Solr is even better. Use your SQL database as the master data and populate/rebuild the index at a frequency that suits you.

The larger the index, the slower querying will be, but Lucene can swallow a lot of data before the index size becomes a burden.

The index should contain all searchable data. If you are indexing people, this could be their name and email address. You can skip touching the database at all if you also include in the index every property that the People entity is composed of, even those that are not meant to be searchable. Another approach is to include only name, email and peopleID, and then query the database by ID to get the People entity.
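To make the two approaches concrete, here is a rough sketch in plain Python. It is not Lucene.NET code: `TinyIndex`, `people_table` and the `city` property are made up for illustration, and the toy index only mimics the idea that a field can be searchable, stored, or both.

```python
from collections import defaultdict


class TinyIndex:
    """Toy inverted index (hypothetical, not Lucene): searchable vs. stored data."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of document numbers
        self.stored = []                  # document number -> stored fields

    def add(self, searchable, stored):
        doc_no = len(self.stored)
        self.stored.append(dict(stored))
        # Only the searchable values are tokenized into the postings lists.
        for text in searchable.values():
            for term in text.lower().split():
                self.postings[term].add(doc_no)

    def search(self, term):
        # Return the stored fields of every document containing the term.
        hits = sorted(self.postings.get(term.lower(), ()))
        return [self.stored[n] for n in hits]


# Hypothetical master data standing in for the SQL database.
people_table = {
    7: {"peopleID": 7, "name": "Ada Lovelace",
        "email": "ada@example.org", "city": "London"},
}

full = TinyIndex()  # approach 1: store every property in the index
slim = TinyIndex()  # approach 2: store only the ID, fetch the rest by ID

for row in people_table.values():
    searchable = {"name": row["name"], "email": row["email"]}
    full.add(searchable, stored=row)
    slim.add(searchable, stored={"peopleID": row["peopleID"]})

# Approach 1: the hit already carries the whole entity, no database call.
hit_full = full.search("ada")[0]

# Approach 2: the hit carries only the ID, so one lookup by ID follows.
hit_slim = people_table[slim.search("ada")[0]["peopleID"]]
```

The first variant trades a larger index for zero database round trips; the second keeps the index small but costs one lookup by ID per hit.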

Upvotes: 2
