Angelina

Reputation: 2265

Elasticsearch not returning results when a partial word is entered

I have my analyzers set like this:

"analyzer": {
    "edgeNgram_autocomplete": {
        "type": "custom",
        "tokenizer": "standard",
        "filter": ["lowercase", "autocomplete"]
    },
    "full_name": {
        "filter": ["standard", "lowercase", "asciifolding"],
        "type": "custom",
        "tokenizer": "standard"
    }
}

My filter:

"filter": {
    "autocomplete": {
        "type": "edgeNGram",
        "side": "front",
        "min_gram": 1,
        "max_gram": 50
    }
}

Mapping for the text field and its analyzers:

"textbox": {
    "_parent": {
        "type": "document"
    },            
    "properties": {
        "text": {
            "fields": {
                "text": {
                    "type":"string",
                    "analyzer":"full_name"
                },
                "autocomplete": {
                    "type": "string",
                    "index_analyzer": "edgeNgram_autocomplete",
                    "search_analyzer": "full_name",
                    "analyzer": "full_name"
                }
            },
            "type":"multi_field"
        }
    }
}

Put together, this makes up the settings and mapping for my docstore index:

PUT http://localhost:9200/docstore
{
    "settings": {
        "analysis": {
            "analyzer": {
                "edgeNgram_autocomplete": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "autocomplete"]
                },                
                "full_name": {
                   "filter":["standard","lowercase","asciifolding"],
                   "type":"custom",
                   "tokenizer":"standard"
                }
            },
            "filter": {
                "autocomplete": {
                    "type": "edgeNGram",
                    "side":"front",
                    "min_gram": 1,
                    "max_gram": 50
                }
            }
        }
    },
    "mappings": {
        "space": {
            "properties": {
                "name": {
                    "type": "string",
                    "index": "not_analyzed"
                }
            }
        },
        "document": {
            "_parent": {
                "type": "space"
            },
            "properties": {
                "name": {
                    "type": "string",
                    "index": "not_analyzed"
                }
            }
        },
        "textbox": {
            "_parent": {
                "type": "document"
            },            
            "properties": {
                "bbox": {
                    "type": "long"
                },
                "text": {
                    "fields": {
                        "text": {
                            "type":"string",
                            "analyzer":"full_name"
                        },
                        "autocomplete": {
                            "type": "string",
                            "index_analyzer": "edgeNgram_autocomplete",
                            "search_analyzer": "full_name",
                            "analyzer":"full_name"
                        }
                    },
                    "type":"multi_field"
                }
            }
        },
        "entity": {
            "_parent": {
                "type": "document"
            },
            "properties": {
                "bbox": {
                    "type": "long"
                },
                "name": {
                    "type": "string",
                    "index": "not_analyzed"
                }
            }
        }
    }
}

Add a space to hold all docs:

POST http://localhost:9200/docstore/space
{
    "name": "Space 1"
}

When the user enters the partial word proj, the search should return all text containing a word that starts with proj.

But it returns nothing.

My query:

POST http://localhost:9200/docstore/textbox/_search
{
    "query": {
        "match": {
            "text": "proj"
        }
    },
    "filter": {
        "has_parent": {
            "type": "document",
            "query": {
                "term": {
                    "name": "1-a1-1001.pdf"
                }
            }
        }
    }
}

If I search for the full word project, I get:

{
    "took": 4,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 2,
        "max_score": 3.0133555,
        "hits": [
            {
                "_index": "docstore",
                "_type": "textbox",
                "_id": "AVRuV2d_f4y6IKuxK35g",
                "_score": 3.0133555,
                "_routing": "AVRuVvtLf4y6IKuxK33f",
                "_parent": "AVRuV2cMf4y6IKuxK33g",
                "_source": {
                    "bbox": [
                        8750,
                        5362,
                        9291,
                        5445
                    ],
                    "text": [
                        "Sample Project"
                    ]
                }
            },
            {
                "_index": "docstore",
                "_type": "textbox",
                "_id": "AVRuV2d_f4y6IKuxK35Y",
                "_score": 2.4106843,
                "_routing": "AVRuVvtLf4y6IKuxK33f",
                "_parent": "AVRuV2cMf4y6IKuxK33g",
                "_source": {
                    "bbox": [
                        8645,
                        5170,
                        9070,
                        5220
                    ],
                    "text": [
                        "Project Name and Address"
                    ]
                }
            }
        ]
    }
}

Maybe my edgeNGram filter is not suited for this? I have set:

"side":"front"

Should I do it differently?

Does anyone know what I am doing wrong?

Upvotes: 0

Views: 363

Answers (2)

Eyal.Dahari

Reputation: 770

The problem is the name of the setting that assigns the index-time analyzer to the autocomplete sub-field.

Change:

"index_analyzer": "edgeNgram_autocomplete"

To:

"analyzer": "edgeNgram_autocomplete"
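For reference, the corrected autocomplete sub-field might look like the sketch below. It is written as a Python dict that serializes to the JSON mapping fragment. It assumes the sub-field's existing "analyzer": "full_name" line is dropped so the key does not appear twice, and it keeps "search_analyzer" as in the question. The variable name autocomplete_subfield is only illustrative.

```python
import json

# Sketch of the "autocomplete" sub-field after the rename.
# Assumption: the old "analyzer": "full_name" entry is removed, because
# "analyzer" now carries the index-time edge n-gram analyzer.
autocomplete_subfield = {
    "type": "string",
    "analyzer": "edgeNgram_autocomplete",  # applied at index time
    "search_analyzer": "full_name",        # applied to the query text
}

# Print the fragment as it would sit under "fields" in the mapping.
print(json.dumps({"autocomplete": autocomplete_subfield}, indent=4))
```

Keeping "full_name" as the search analyzer means the query text is not split into n-grams itself, so proj is looked up as a single term against the indexed prefixes.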

Also search on text.autocomplete, as @Andrei Stefan showed in his answer:

POST http://localhost:9200/docstore/textbox/_search
{
    "query": {
        "match": {
            "text.autocomplete": "proj"
        }
    }
}

And it will work as expected!

I have tested your configuration on Elasticsearch 2.3.

By the way, the multi_field type is deprecated; sub-fields can be declared under "fields" on a regular string field instead.

Hope I have managed to help :)

Upvotes: 1

Andrei Stefan

Reputation: 52366

Your query should actually try to match on text.autocomplete and not text:

  "query": {
    "match": {
      "text.autocomplete": "proj"
    }
  }
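To see why the field name matters, here is a rough Python sketch of what each sub-field indexes. It is not Elasticsearch code: standard_tokens is a simplified stand-in for the standard tokenizer plus the lowercase filter (asciifolding is ignored), and all function names are made up for illustration.

```python
import re

def standard_tokens(text):
    # Simplified stand-in for the standard tokenizer + lowercase filter.
    return [t.lower() for t in re.findall(r"\w+", text)]

def edge_ngrams(token, min_gram=1, max_gram=50):
    # Front-anchored edge n-grams: every prefix of the token
    # from min_gram up to max_gram characters.
    return [token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)]

def index_text(text):
    # Terms indexed for "text" (full_name analyzer): whole words only.
    return set(standard_tokens(text))

def index_autocomplete(text):
    # Terms indexed for "text.autocomplete": all prefixes of each word.
    return {g for tok in standard_tokens(text) for g in edge_ngrams(tok)}

def match(query, indexed_terms):
    # A match query analyzes the query text (here with the whole-word
    # analyzer) and hits if any resulting term exists in the index.
    return any(t in indexed_terms for t in standard_tokens(query))

doc = "Sample Project"
text_terms = index_text(doc)
autocomplete_terms = index_autocomplete(doc)

print(match("proj", text_terms))          # False: only "sample", "project" are indexed
print(match("proj", autocomplete_terms))  # True: "proj" is one of the indexed prefixes
print(match("project", text_terms))       # True: the full word is indexed
```

So "side": "front" is fine for prefix matching; the prefixes just live in text.autocomplete, not in text.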

Upvotes: 1
