The Bndr

Reputation: 13394

Impact of index size on search speed (to store or not to store)

Right now we are using Solr as a full-text index, where all fields of the documents are indexed but not stored. There are a few million documents, and the index size is 50 GB. The average query time is around 100 ms.

To use features like highlighting, we are thinking about additionally storing the text. But that could double the size of the index files.

I know there is no linear relation between index size and query time. Increasing the number of documents by a factor of 10 makes nearly no difference to the query time.

But overall, the system (Solr/Lucene/Linux/...) has to handle more information: the index files, for example, take up many more inodes, and so on.

So I'm sure there is some impact of index size on query time. (But is it noticeable?)

1st: Do you think I'm right? Do you have any experience with index size and search speed with and without stored text? Is it smart and reasonable to blow up the index by storing the documents?

2nd: Do you know how Solr/Lucene handles stored text? Maybe in separate files? (So that there is no impact on simple searches, where no stored text is needed?)

Thank you.

Upvotes: 4

Views: 2185

Answers (1)

javanna

Reputation: 60195

Yes, it's true that the index grows if you store big fields, but if you want to highlight them, there is no other way. I don't think query speed will drop much; at most a little, because more data has to be fetched when retrieving results, but that is not very significant.
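As a rough illustration of keeping the retrieved data small, here is a minimal sketch that builds a Solr select URL with highlighting turned on while the result list itself returns only an id. The parameters `q`, `fl`, `rows`, `hl`, `hl.fl` and `wt` are standard Solr request parameters; the host, the core name `mycore`, and the field names `id` and `text` are assumptions made up for this example, not taken from the question.

```python
from urllib.parse import urlencode


def build_highlight_query(base_url, query, highlight_field, rows=10):
    """Build a Solr /select URL that highlights one field.

    base_url, the "id" field and highlight_field are hypothetical;
    adapt them to your own core and schema.
    """
    params = {
        "q": query,
        "fl": "id",               # result list returns only the id, not the big stored text
        "rows": rows,             # stored text is only read for the returned rows
        "hl": "true",             # enable highlighting
        "hl.fl": highlight_field, # field to build snippets from (must be stored)
        "wt": "json",
    }
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Assumed local Solr instance and core name.
    print(build_highlight_query("http://localhost:8983/solr/mycore", "text:lucene", "text"))
```

With a small `rows` value, the stored text only has to be read for the handful of documents actually returned, which is one reason the extra cost tends to stay modest.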

Regarding the Lucene index format and the different files within the index, you can have a look here: the stored fields are kept in a dedicated file.
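To see how much of an existing index is taken up by stored fields, something like the following sketch could help. It sums file sizes per extension in an index directory and reports the share of the stored-field files. It assumes the index is not using compound files (with `.cfs` files the stored fields are packed inside and cannot be told apart this way) and that stored fields use the `.fdt` and `.fdx` extensions, which can vary slightly between Lucene versions. The directory path is a placeholder.

```python
import os
from collections import defaultdict

# Assumed stored-field extensions (.fdt = field data, .fdx = field index);
# check the file-format documentation for your Lucene version.
STORED_FIELD_EXTENSIONS = {".fdt", ".fdx"}


def size_by_extension(index_dir):
    """Return total bytes per file extension for the files in index_dir."""
    sizes = defaultdict(int)
    for name in os.listdir(index_dir):
        path = os.path.join(index_dir, name)
        if os.path.isfile(path):
            # Files without an extension (e.g. segments_N) are keyed by name.
            ext = os.path.splitext(name)[1] or name
            sizes[ext] += os.path.getsize(path)
    return dict(sizes)


def stored_fields_share(sizes):
    """Return the fraction of the total size used by stored-field files."""
    total = sum(sizes.values())
    stored = sum(v for k, v in sizes.items() if k in STORED_FIELD_EXTENSIONS)
    return stored / total if total else 0.0


if __name__ == "__main__":
    index_dir = "/path/to/solr/data/index"  # placeholder path
    if os.path.isdir(index_dir):
        sizes = size_by_extension(index_dir)
        for ext, size in sorted(sizes.items(), key=lambda kv: -kv[1]):
            print(f"{ext:>12}  {size / 2**20:10.1f} MiB")
        print(f"stored fields: {stored_fields_share(sizes):.1%} of index")
```

Running this on a test index before and after setting the text field to stored would show directly how much the stored text adds.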

Upvotes: 1
