mommomonthewind

Reputation: 4640

Elasticsearch: How to search for results from the last 30 seconds?

I have two Python client programs. One runs every 30 seconds and submits data to Elasticsearch. The other also runs every 30 seconds, downloads the data submitted by the first program, and analyzes it.

In the second program, I want to limit the search to data submitted in the last 30 seconds only (because earlier data has already been downloaded).

I have a search command (modified from https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-range-query.html):

POST /my_index_*/_search
{
  "size": 10,
  "query": {
    "match_all": {},
    "range": {
      "date": {
        "gte": "now-30s/d",
        "lt": "now/d"
      }
    }
  }
}

But it returns an error.

{
  "error": {
    "root_cause": [
      {
        "type": "parsing_exception",
        "reason": "[match_all] malformed query, expected [END_OBJECT] but found [FIELD_NAME]",
        "line": 5,
        "col": 5
      }
    ],
    "type": "parsing_exception",
    "reason": "[match_all] malformed query, expected [END_OBJECT] but found [FIELD_NAME]",
    "line": 5,
    "col": 5
  },
  "status": 400
}

I am fairly sure the error comes from the range condition, because without it the query returns data correctly.

How should I do this? And is there a better way than a fixed 30-second window to make sure the second program never gets the same data more than once, but also does not miss any data?

Many thanks

Upvotes: 0

Views: 2581

Answers (1)

Jose

Reputation: 289

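To fix the parsing failure, you just need to remove match_all from the query: the "query" object accepts a single top-level clause, so match_all and range cannot sit side by side in it. As a best practice, you should also move the range query into filter context, as below.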

{
  "query": {
    "bool": {
      "filter": {
        "range": {
          "date": {
            "gte": "now-30s",
            "lte": "now"
          }
        }
      }
    }
  }
}

Please read more details here - https://www.elastic.co/guide/en/elasticsearch/reference/current/query-filter-context.html. Also, I suppose you added /d by mistake: it rounds the time down to the start of the day, so in your query both "now-30s/d" and "now/d" normally resolve to midnight today and the range matches nothing. Please refer to the date math documentation here - https://www.elastic.co/guide/en/elasticsearch/reference/current/common-options.html#date-math.
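
To illustrate the /d rounding, here is a minimal Python sketch that mimics what happens to the two bounds from the question. It does not call Elasticsearch; floor_to_day and rounded_bounds are hypothetical helper names, the sample timestamp is made up, and it assumes rounding in UTC (the default when no time_zone is given).

```python
from datetime import datetime, timedelta, timezone


def floor_to_day(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its day, like "/d" does for gte and lt."""
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)


def rounded_bounds(now: datetime):
    """Mimic "gte": "now-30s/d" and "lt": "now/d" for a given value of now."""
    gte = floor_to_day(now - timedelta(seconds=30))
    lt = floor_to_day(now)
    return gte, lt


# Made-up sample time: 12:00:10 UTC on an arbitrary day.
now = datetime(2024, 1, 15, 12, 0, 10, tzinfo=timezone.utc)
gte, lt = rounded_bounds(now)

# Both bounds land on the same midnight, so "date >= gte and date < lt" is an empty range.
print(gte.isoformat(), lt.isoformat())
print("empty range:", gte == lt)
```

Only during the first 30 seconds after midnight do the two bounds differ, and then the range covers the whole previous day rather than the last 30 seconds.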

For finding unprocessed records, the right solution depends on the type of data you are dealing with. But I would suggest building the base query on the last processed timestamp, rather than querying for the last 30 seconds.
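
A rough sketch of the last-processed-timestamp idea in Python, under some assumptions: the timestamp field is called "date" as in the question, hits arrive in the standard hits.hits[]._source shape, and build_query and advance_checkpoint are hypothetical helper names. The functions only build and read plain dictionaries; sending the body to POST /my_index_*/_search is left to whichever client you use.

```python
from typing import Any, Dict, List, Optional


def build_query(last_seen: Optional[str], size: int = 1000) -> Dict[str, Any]:
    """Build a search body that asks only for documents newer than last_seen."""
    if last_seen is None:
        # First run: no checkpoint yet, so take everything.
        query: Dict[str, Any] = {"match_all": {}}
    else:
        # Later runs: filter context, strictly newer than the checkpoint.
        query = {"bool": {"filter": {"range": {"date": {"gt": last_seen}}}}}
    # Sort ascending so the last hit carries the newest timestamp.
    return {"size": size, "sort": [{"date": "asc"}], "query": query}


def advance_checkpoint(last_seen: Optional[str], hits: List[Dict[str, Any]]) -> Optional[str]:
    """Return the new checkpoint after processing one page of hits."""
    if not hits:
        return last_seen  # nothing new, keep the old checkpoint
    return hits[-1]["_source"]["date"]


# Example with made-up hits in the shape of response["hits"]["hits"].
fake_hits = [
    {"_source": {"date": "2024-01-15T12:00:01Z"}},
    {"_source": {"date": "2024-01-15T12:00:07Z"}},
]
checkpoint = advance_checkpoint(None, fake_hits)
next_body = build_query(checkpoint)
```

The checkpoint would need to be stored somewhere that survives restarts of the second program. Note that a strict "gt" can skip a document that shares its timestamp with the last processed one but becomes searchable later, so depending on the data you may prefer "gte" plus de-duplication by document ID.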

Upvotes: 1
