Reputation: 261
Earlier my index was using the Lucene analyzer. I changed it to the Microsoft analyzer, and now the index size has increased considerably. Why does the size increase so much? P.S. See the attachment.
Upvotes: 0
Views: 124
Reputation: 1972
A difference in index size is expected. For each word in your documents, a Microsoft analyzer produces both the original word and its base form. For example, if your document contains the word "running", Azure Search will index two terms: "running" and "run". See my answer in the following post for more details: Azure Search: Searching for singular version of a word, but still include plural version in results
Lucene analyzers stem words, which results in fewer unique terms in the index. You can learn more about the differences here: https://learn.microsoft.com/en-us/rest/api/searchservice/Language-support?redirectedfrom=MSDN
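To make the contrast concrete, here is a toy sketch in Python. It does not call either analyzer; the `BASE_FORM` table is a hand-written, hypothetical stand-in for real linguistic analysis, and the two functions only mimic the behaviors described above (original word plus base form versus base form only).

```python
# Toy illustration only: BASE_FORM is a hypothetical lookup table,
# not the output of any real Microsoft or Lucene analyzer.
BASE_FORM = {
    "running": "run", "runs": "run", "run": "run",
    "searches": "search", "searching": "search", "search": "search",
}

def terms_original_plus_base(words):
    """Mimic indexing both the original word and its base form."""
    terms = set()
    for word in words:
        terms.add(word)
        terms.add(BASE_FORM[word])
    return terms

def terms_base_only(words):
    """Mimic indexing only the reduced (stemmed) form."""
    return {BASE_FORM[word] for word in words}

words = ["running", "runs", "run", "searches", "searching", "search"]
both = terms_original_plus_base(words)
base_only = terms_base_only(words)
print(len(both), len(base_only))
```

With this small word list, the first approach keeps every distinct surface form in addition to the base forms, while the second collapses them. The real ratio depends on your documents and language.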
The impact on index size varies by analyzer and language. You can test the behavior of the analyzer you are using with the Analyze API: https://learn.microsoft.com/en-us/rest/api/searchservice/test-analyzer.
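As a rough sketch of how you might compare analyzers with the Analyze API, the Python below builds the request and extracts the tokens from a response. The service name, index name, API key, and the `api_version` value are placeholders you would need to replace, and the helper names are my own. Please check the linked documentation for the exact request shape supported by your API version.

```python
import json
import urllib.request

def build_analyze_request(service, index, api_key, text, analyzer,
                          api_version="2020-06-30"):
    """Build a POST request for the Analyze API (api_version is an assumption)."""
    url = (f"https://{service}.search.windows.net/indexes/{index}/analyze"
           f"?api-version={api_version}")
    body = json.dumps({"text": text, "analyzer": analyzer}).encode("utf-8")
    headers = {"Content-Type": "application/json", "api-key": api_key}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")

def tokens_from_response(payload):
    """Return the token strings from a parsed Analyze API response."""
    return [item["token"] for item in payload.get("tokens", [])]

# Placeholder values; nothing is sent until you call urlopen yourself.
request = build_analyze_request("my-service", "my-index", "MY-API-KEY",
                                "running", "en.microsoft")

# To send it:
#   with urllib.request.urlopen(request) as response:
#       print(tokens_from_response(json.load(response)))
```

Running the same text through `en.microsoft` and `en.lucene` and comparing the token lists should show which analyzer emits more terms for your content.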
That being said, the difference you are seeing is larger than I would expect. Please reach out to me at janusz.lembicz at microsoft to discuss the details of your scenario.
Upvotes: 2