Maulik Doshi
Maulik Doshi

Reputation: 323

Elasticsearch aggregation on date range

suppose following is mapping for employee working :

{
    "Project":
    {
         "startDate":{"type":"Date"},
         "endDate":{"type":"Date"},
         "employees":{"type":"keyword"}
    }
}

PFB sample data:

{
    "Project1":
    {
         "startDate":"2019-07-01",
         "endDate":"2019-07-03"
         "employees":["emp1","emp2"]
    }
},
{
    "Project2":
    {
         "startDate":"2019-07-02",
         "endDate":"2019-07-04"
         "employees":["emp3","emp4"]
    }
}

Here employees have list of employees working on that project. I am unable to write aggregation query which gives me number of employees working on each day. i.e. Employee is working when he is any one or more projects having startdate <= currentDate and endDate >=currentDate.

I want following result:

{
    {
     "key":"2019-07-01",
     "EmployeeCount":2
    },
    {
     "key":"2019-07-02",
     "EmployeeCount":4
    },
    {
     "key":"2019-07-03",
     "EmployeeCount":4
    },
    {
     "key":"2019-07-04",
     "EmployeeCount":2
    }
}

Can you guide me which aggregation will help me solve this?

Upvotes: 1

Views: 191

Answers (1)

Kamal Kunjapur
Kamal Kunjapur

Reputation: 8840

I'm afraid what you are looking for is not possible.

Elasticsearch doesn't support the date histogram aggregation in such a way to pick up the value from two different date fields, which in your case is startDate and endDate.

The only way to get what you want is:

  • Get the employee count while performing aggregation on startDate
  • Get the employee count while performing aggregation on endDate
  • Manage the logic to sum the results in your service layer.

Below is the sample aggregation for calculating the count of employees using startDate.

Sample Index

PUT mysampleindex
{  
  "mappings": {
    "properties": {  
       "project":{
         "properties": { 
            "startDate":{  
               "type":"date"
            },
            "endDate":{  
               "type":"date"
            },
            "employees":{  
               "type":"keyword"
            }
         }
      }
    }
  }
}

Sample Documents

POST mysampleindex/_doc/1
{
    "project":
    {
         "startDate":"2019-07-01",
         "endDate":"2019-07-03",
         "employees":["emp1","emp2"]
    }
}

POST mysampleindex/_doc/2
{
    "project":
    {
         "startDate":"2019-07-02",
         "endDate":"2019-07-04",
         "employees":["emp3","emp4"]
    }
}

Aggregation Query:

POST mysampleindex/_search
{
  "size": 0, 
  "aggs": {
    "mydates": {
      "date_histogram": {
        "field": "project.startDate",
        "interval": "day",
        "format" : "yyyy-MM-dd"
      },
      "aggs": {
        "emp_count": {
          "value_count": {
            "field": "project.employees"
          }
        }
      }
    }
  }
}

Note that I've used date histogram aggregation with day as interval along with the value_count aggregation on employees as its child aggregation.

Query Result:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "mydates" : {
      "buckets" : [
        {
          "key_as_string" : "2019-07-01",
          "key" : 1561939200000,
          "doc_count" : 1,
          "emp_count" : {                        <---- Count of employees
            "value" : 2
          }
        },
        {
          "key_as_string" : "2019-07-02",
          "key" : 1562025600000,
          "doc_count" : 1,
          "emp_count" : {                         <---- Count of employees
            "value" : 2
          }
        }
      ]
    }
  }
}

You can count the employees in similar way for endDate(just replace startDate with endDate in above aggregation query). Once you have both the results, you could perform the addition of what you are looking for in your service layer.

Although technically this doesn't, I just hope this helps!

Upvotes: 2

Related Questions