Gonçalo Cabrita

Reputation: 353

What is the best way to query the document closest to a date-time in Elasticsearch?

I need to retrieve the document whose geo location and date-time are closest to those in the request, so I'm not looking for an exact date-time match but for the nearest one. I solved it with a custom script, but I suspect there is a better way, similar to how I filter on geo location using a point and a distance.

Here's my code (in python):

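import json
import requests

# Inside the request handler: parse the request body once and reuse it.
body = json.loads(self.request.body)

query = {
    "query": {
        "function_score": {
            # Use the script's value as the final score.
            "boost_mode": "replace",
            "query": {
                "filtered": {
                    "query": {
                        "match_all": {}
                    },
                    # Only consider documents within 10 km of the requested location.
                    "filter": {
                        "geo_distance": {
                            "distance": "10km",
                            "location": body["location"]
                        }
                    }
                }
            },
            "script_score": {
                "lang": "groovy",
                "script_file": "calculate-score",
                "params": {
                    "stamp": body["stamp"]
                }
            }
        }
    },
    # A lower score means a smaller time difference, so sort ascending.
    "sort": [
        {"_score": "asc"}
    ],
    "size": 1
}

response = requests.get('http://localhost:9200/meteo/meteo/_search', data=json.dumps(query))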

The custom calculate-score.groovy script contains the following:

abs(new java.text.SimpleDateFormat("yyyy-MM-dd\'T\'HH:mm").parse(stamp).getTime() - doc["stamp"].date.getMillis()) / 60000

The script returns, as the score, the absolute difference in minutes between the document's date-time and the requested date-time, which is why the results are sorted by ascending _score.
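To make the scoring concrete, here is a rough Python equivalent of what the script computes. It is only a sketch: the function name minutes_apart is made up for illustration, and it assumes both timestamps arrive as strings in the same yyyy-MM-dd'T'HH:mm pattern the script parses (in the real script the document side comes from doc["stamp"]).

```python
from datetime import datetime

# Mirrors the "yyyy-MM-dd'T'HH:mm" pattern used in the Groovy script.
STAMP_FORMAT = "%Y-%m-%dT%H:%M"


def minutes_apart(requested: str, doc_stamp: str) -> float:
    """Absolute difference in minutes between two timestamps.

    Hypothetical helper: both arguments are assumed to be strings
    in STAMP_FORMAT.
    """
    a = datetime.strptime(requested, STAMP_FORMAT)
    b = datetime.strptime(doc_stamp, STAMP_FORMAT)
    return abs((a - b).total_seconds()) / 60
```

The result is symmetric in its arguments, so a document 12 hours before the request scores the same as one 12 hours after it.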

Is there any other way to achieve this?

Upvotes: 1

Views: 3181

Answers (1)

keety

Reputation: 17461

You should be able to use function_score to do this. You could use the decay functions described in the documentation to give a larger score to documents closer to the origin timestamp. The example below uses a scale of 28800 minutes, i.e. 20 days.
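For intuition, the linear decay can be sketched in plain Python. This follows the formula given in the Elasticsearch function_score documentation and assumes the default decay of 0.5 and an offset of 0. The function name linear_decay is made up for illustration and is not part of any Elasticsearch client.

```python
def linear_decay(distance: float, scale: float,
                 offset: float = 0.0, decay: float = 0.5) -> float:
    """Score in [0, 1] that falls off linearly with distance from the origin.

    distance and scale must use the same unit (minutes here).
    The defaults for offset and decay are assumed to match Elasticsearch's.
    """
    # s is chosen so that the score equals `decay` at distance == scale.
    s = scale / (1.0 - decay)
    # Distances within `offset` of the origin are treated as zero.
    effective = max(0.0, abs(distance) - offset)
    return max((s - effective) / s, 0.0)
```

With a scale of 28800 minutes, a document exactly 20 days from the origin scores 0.5, and one 40 days away or more scores 0. Unlike the script in the question, a higher score here means a closer document.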

Example:

PUT test
PUT test/test/_mapping
{
    "properties": {
        "stamp": {
            "type": "date",
            "format": "dateOptionalTime"
        }
    }
}
PUT test/test/1
{
    "stamp": "2015-10-15T00:00"
}

PUT test/test/2
{
    "stamp": "2015-10-15T12:00"
}

post test/_search
{
   "query": {
      "function_score": {
         "functions": [
            {
               "linear": {
                   "stamp" : {
                        "origin": "now",
                        "scale": "28800m"
                   }
               }
            }
         ],
         "score_mode" : "multiply",
         "boost_mode": "multiply",
         "query": {
            "match_all": {}
         }
      }
   }
}
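Applied to the question's setup, the same idea might look like the Python sketch below. It keeps the asker's filtered query with the geo_distance filter and swaps the script for a linear decay whose origin is the requested timestamp instead of "now". The helper name build_query, its default values, and the sample coordinates are assumptions made for illustration, and the stamp string is assumed to be in a format the stamp field's mapping can parse.

```python
import json


def build_query(location, stamp, distance="10km", scale="28800m"):
    """Build a search body that ranks nearby documents by time proximity.

    Hypothetical helper: `location` and `stamp` would come from the
    parsed request body, as in the question.
    """
    return {
        "query": {
            "function_score": {
                "query": {
                    "filtered": {
                        "query": {"match_all": {}},
                        # Restrict candidates to documents near the location.
                        "filter": {
                            "geo_distance": {
                                "distance": distance,
                                "location": location,
                            }
                        },
                    }
                },
                "functions": [
                    # Documents closer in time to `stamp` get a higher score.
                    {"linear": {"stamp": {"origin": stamp, "scale": scale}}}
                ],
                # Use the decay value itself as the final score.
                "boost_mode": "replace",
            }
        },
        # Results are ordered by descending _score by default,
        # so the first hit is the closest in time.
        "size": 1,
    }


# Example with made-up coordinates.
example = build_query({"lat": 38.7, "lon": -9.1}, "2015-10-15T06:00")
print(json.dumps(example, indent=2))
```

Because closer documents now score higher, the explicit ascending sort from the question is no longer needed.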

Upvotes: 5
