Jack A.

Reputation: 4443

Solr suggester returning terms from deleted documents

I have a SolrCloud setup and I'm testing the suggest component. The index holds several hundred documents, some of which I did not want because they contain gibberish (they were binary files that were improperly converted to text). I've removed those documents from the index, but the gibberish words from them still show up in the suggestions.

My suggest configuration looks like this:

<searchComponent name="suggest" class="solr.SuggestComponent">
    <lst name="suggester">
        <str name="name">fuzzySuggester</str>
        <str name="lookupImpl">FuzzyLookupFactory</str>
        <str name="dictionaryImpl">HighFrequencyDictionaryFactory</str>
        <str name="storeDir">suggester_fuzzy_dir</str>
        <str name="field">dictionary_text</str>
        <str name="suggestAnalyzerFieldType">phrase_suggest</str>
        <str name="exactMatchFirst">true</str>
        <float name="threshold">0.001</float>
        <str name="buildOnStartup">false</str>
        <str name="buildOnCommit">true</str>
    </lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
    <lst name="defaults">
        <str name="suggest">true</str>
        <str name="suggest.dictionary">fuzzySuggester</str>
        <str name="suggest.onlyMorePopular">true</str>
        <str name="suggest.count">5</str>
        <str name="suggest.collate">true</str>
    </lst>
    <arr name="components">
        <str>suggest</str>
    </arr>
</requestHandler>

Note that buildOnCommit is set to true. I also tried to rebuild the dictionary by sending a /suggest query with the suggest.build=true parameter, but that had no effect.

Is there something else required to remove terms from the dictionary?

Upvotes: 1

Views: 248

Answers (1)

Jack A.

Reputation: 4443

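Despite using expungeDeletes=true in the update, the deleted documents were still hanging around in the index. Optimizing the index removed them, and that appears to have removed all the gibberish terms from the suggestions. Presumably the dictionary is built from the index's terms, which still include terms from deleted documents until the segments holding them are merged.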
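For anyone wanting to script this, here is a minimal sketch of the two requests involved: an optimize on the update handler, followed by an explicit suggester rebuild. The host, port, and collection name (`mycollection`) are placeholders I made up, not values from my setup; `fuzzySuggester` and `/suggest` come from the config in the question. With buildOnCommit set to true the second request may be redundant, so treat it as a fallback.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical values: adjust the host and collection name for your cluster.
SOLR_BASE = "http://localhost:8983/solr"
COLLECTION = "mycollection"


def optimize_url(base=SOLR_BASE, collection=COLLECTION):
    """Build the URL that asks the update handler to optimize the index."""
    params = urlencode({"optimize": "true", "waitSearcher": "true"})
    return f"{base}/{collection}/update?{params}"


def rebuild_url(base=SOLR_BASE, collection=COLLECTION, dictionary="fuzzySuggester"):
    """Build the URL that asks the /suggest handler to rebuild one dictionary."""
    params = urlencode({
        "suggest": "true",
        "suggest.build": "true",
        "suggest.dictionary": dictionary,
    })
    return f"{base}/{collection}/suggest?{params}"


def optimize_and_rebuild():
    """Send both requests in order; raises if Solr is unreachable or returns an error."""
    for url in (optimize_url(), rebuild_url()):
        with urlopen(url) as resp:
            print(url, resp.status)


if __name__ == "__main__":
    optimize_and_rebuild()
```

Optimizing rewrites the whole index, so on a large collection it can be slow and I/O heavy; for a few hundred documents it should be cheap.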

Upvotes: 1
