Iwko

Reputation: 73

Search phrase through SOLR multivalued field

I am implementing Solr search. When I type "abc def" I want to get all paragraphs that contain "abc def". For example, given these paragraphs:

{
    "paragraphs": ["abc def. bdbdbdbdbd, aa", "abd efe"]
},
{   
    "paragraphs": ["xyzabc def xyz", "fgh xx", "abcdef", "wwwabc defxxx"]
}

I want to get data from the first one. I want an exact match of the phrase, not a match inside a longer phrase: if I search for "god dog", the phrase "god doggo" should not be included in the results.

The problem is that when I run the query paragraphs:"abc def" I get empty results.

This is part of my schema.xml:

  <field name="paragraphs" type="text" indexed="true" stored="true" required="true" multiValued="true"/>
  <types>
    <fieldType name="text" class="solr.TextField" sortMissingLast="true" omitNorms="true">
        <analyzer type="index">
            <tokenizer class="solr.KeywordTokenizerFactory"/>
            <filter class="solr.ASCIIFoldingFilterFactory" preserveOriginal="true"/>
            <filter class="solr.LowerCaseFilterFactory"/>
        </analyzer>
        <analyzer type="query">
            <tokenizer class="solr.KeywordTokenizerFactory"/>
            <filter class="solr.ASCIIFoldingFilterFactory" preserveOriginal="true"/>
            <filter class="solr.LowerCaseFilterFactory"/>
        </analyzer>
    </fieldType>
</types>

I tried to use StandardTokenizerFactory instead of KeywordTokenizerFactory, but the result was the same. I can get data using (*abc*), but this also returns elements like xabcz, which I am not interested in.

Upvotes: 0

Views: 301

Answers (1)

MatsLindh

Reputation: 52792

You'll have to drop the KeywordTokenizer - it keeps the whole stored text as a single token.

Using the WhitespaceTokenizer or the StandardTokenizer should work. Remember that you have to reindex after changing the analysis chain in any way (unless you're only changing how content is processed for querying).
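A minimal field type along those lines (a sketch only - it reuses the type name from the question's schema and assumes the standard Solr factories) could look like:

```xml
<fieldType name="text" class="solr.TextField" sortMissingLast="true" omitNorms="true">
    <!-- one analyzer for both index and query time, so tokens line up -->
    <analyzer>
        <!-- splits each value on word boundaries instead of keeping it as one token -->
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.ASCIIFoldingFilterFactory" preserveOriginal="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>
```

With this chain, "abc def. bdbdbdbdbd, aa" is indexed as the tokens abc, def, bdbdbdbdbd, aa, so the phrase query paragraphs:"abc def" matches the adjacent tokens abc and def, while abcdef and wwwabc defxxx produce different tokens and don't match.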

Using the default dynamic field *_txt (defined as a StandardTokenizer with only lowercasing and stopword removal), and with your two documents indexed:

q=*:*

"response":{"numFound":2,"start":0,"docs":[
    {
        "paragraphs_txt":["abc def. bdbdbdbdbd, aa",
          "abd efe"],
        "id":"d696c435-2267-442d-9abe-ea754793d5cf",
        "_version_":1602547400543567872},
    {
        "paragraphs_txt":["xyzabc def xyz",
          "fgh xx",
          "abcdef",
          "wwwabc defxxx"],
        "id":"09bbba7c-b407-403c-9771-582ef23f6b56",
        "_version_":1602547400598093824}]
}}

q=paragraphs_txt:"abc def"

"response":{"numFound":1,"start":0,"docs":[
    {
        "paragraphs_txt":["abc def. bdbdbdbdbd, aa",
          "abd efe"],
        "id":"d696c435-2267-442d-9abe-ea754793d5cf",
        "_version_":1602547400543567872}]
}}

Upvotes: 1
