Maxime Détaille

Reputation: 11

Elasticsearch [5.4] scoring by word position in a sentence

I am looking for a solution for my search engine, which is based on ES 5.4. When a user types a query, I need results that start with the same words to get a better score.

It seems logical to me that when a client types "disque de frein" in my search bar, the first results should be

"disque de frein pour golf"
"disque de frein pour porsche"
"disque de frein pour audi"
then
"protection de disque de frein"

But currently, the last result gets a better score than the others.

By the way, I use FOSElasticaBundle, but I just need to know how to get the right results.

Here is my query:

{
  "query":{
    "bool":{
      "should":[
        {
          "match":{
            "produitLibFr":{
              "query":"disque de frein"
            }
          }
        }]
    }
  }
}

Results:

{
  "_index": "app",
  "_type": "produit",
  "_id": "41666191",
  "_score": 10.558487,
  "_source": {
    "produitRef": "FA42436",
    "produitLibFr": "Pied à coulisse pour disques de frein",
    "produitLibEs": "",
    "produitLibGb": "",
    "produitLibNl": "",
    "produitLibPt": ""
  }
},

{
  "_index": "app",
  "_type": "produit",
  "_id": "41666369",
  "_score": 10.5379715,
  "_source": {
    "produitRef": "FA43075",
    "produitLibFr": "Module freins à disques",
    "produitLibEs": "",
    "produitLibGb": "",
    "produitLibNl": "",
    "produitLibPt": ""
  }
},

{
  "_index": "app",
  "_type": "produit",
  "_id": "67938479",
  "_score": 9.800581,
  "_source": {
    "produitRef": "GH28306",
    "produitLibFr": "Disque de frein arrière pour Scirocco & Corrado",
    "produitLibEs": "1 Disco de freno trasero para Scirocco & Corrado",
    "produitLibGb": "1 Rear brake disc for Scirocco & Corrado",
    "produitLibNl": "1 remschijf voor de achterrem voor Scirocco & Corrado",
    "produitLibPt": ""
  }
}

And I want the last one to be in first position.

Do you have any clue for me?

Thanks!

Edit

I tried @Lupanoide's solution and got these results:

Appareil de mesure pour disques de frein - 30.441166
Accessoires pour mesurer les disques de frein - 30.441166
Appareil de mesure pour disques de frein - 30.204206
Accessoires pour mesurer les disques de frein - 29.945782
Protecteur de disque de frein arrière droit - 28.547125
And only then the expected one:
Disque de frein arrière pour Scirocco & Corrado - 28.547125

The scores have increased, but the problem is still the same. The boost param is interesting, but as far as I can see we can only boost a field negatively, and I need to boost the match_phrase_prefix clause positively.

Upvotes: 0

Views: 449

Answers (1)

Lupanoide

Reputation: 3222

Try:

{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "produitLibFr": {
              "query": "disque de frein"
            }
          }
        }
      ],
      "should": [
        {
          "match_phrase_prefix": {
            "produitLibFr": "disque de frein"
          }
        }
      ]
    }
  }
}

This query combines two clauses:

  • the "must" clause: a retrieved doc must match the terms of "disque de frein" in the "produitLibFr" field
  • the "should" clause: a retrieved doc should also contain "disque de frein" as a phrase in the "produitLibFr" field, with the last word treated as a prefix. Note that match_phrase_prefix matches the phrase anywhere in the field, not only at the beginning of its content.

A doc that matches both clauses gets a higher score. To increase the score further, you could consider using the boost param (see the Elasticsearch documentation).
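As a minimal sketch of the boost suggestion, the same query can be written with the long form of match_phrase_prefix so that a positive boost applies to that clause only. The value 10 is an arbitrary assumption, not a recommended setting, and would need tuning against your own data.

```json
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "produitLibFr": {
              "query": "disque de frein"
            }
          }
        }
      ],
      "should": [
        {
          "match_phrase_prefix": {
            "produitLibFr": {
              "query": "disque de frein",
              "boost": 10
            }
          }
        }
      ]
    }
  }
}
```

A boost greater than 1 raises the contribution of the phrase clause relative to the plain match, so docs containing the exact phrase should move ahead of docs that only contain the individual words. It does not by itself distinguish a phrase at the start of the field from one in the middle.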

Upvotes: 1
