Reputation: 425
I'm doing a nested aggregation in Elasticsearch: per year, and then per week within each year. Leap years have 53 weeks, but Elasticsearch gives the last week of a leap year the key "1" and not "53". How can I make Elasticsearch return 53 instead of 1 for the last week?
Here is my query:
    GET _search
    {
      "size": 0,
      "aggs": {
        "activities_per_year": {
          "date_histogram": {
            "field": "start",
            "interval": "1y",
            "format": "yyyy"
          },
          "aggs": {
            "activities_per_week": {
              "date_histogram": {
                "field": "start",
                "interval": "week",
                "format": "w"
              }
            }
          }
        }
      }
    }
And the result (removed data in the middle):
    "key_as_string": "2008",
    "key": 1199145600000,
    "doc_count": 872,
    "activities_per_week": {
      "buckets": [
        {
          "key_as_string": "1",
          "key": 1199059200000,
          "doc_count": 6
        },
        {
          "key_as_string": "2",
          "key": 1199664000000,
          "doc_count": 5
        },
        {
          "key_as_string": "3",
          "key": 1200268800000,
          "doc_count": 15
        },
        {
          "key_as_string": "51",
          "key": 1229299200000,
          "doc_count": 18
        },
        {
          "key_as_string": "52",
          "key": 1229904000000,
          "doc_count": 7
        },
        {
          "key_as_string": "1",
          "key": 1230508800000,
          "doc_count": 1
        }
      ]
2008 is a leap year, and its last week has "key_as_string": "1". I want this to be 53, so I can add it to my dictionary :) How can I do this?
Also, Elasticsearch returns two weeks with "key_as_string": "1" for the year 2013, and I don't think 2013 is a leap year.
Upvotes: 1
Views: 1168
Reputation: 52368
This has some subtle gotchas that you need to be aware of. First of all, Elasticsearch uses the Joda-Time API for date and time handling.
Secondly, take a look at this explanation of what a "week" actually is:
A week based year is one where dates are expressed as a day of week, week number and year (week based). The following description is of the ISO8601 standard used by implementations of this method in this library.
Weeks run from 1 to 52-53 in a week based year. The first day of the week is defined as Monday and given the value 1.
The first week of a year is defined as the first week that has at least four days in the year. As a result of this definition, week 1 may extend into the previous year, and week 52/53 may extend into the following year. Hence the need for the year of weekyear field.
For example, 2003-01-01 was a Wednesday. This means that five days, Wednesday to Sunday, of that week are in 2003. Thus the whole week is considered to be the first week of 2003. Since all weeks start on Monday, the first week of 2003 started on 2002-12-30, ie. in 2002.
The week based year has a specific text format. 2002-12-30 (Monday 30th December 2002) would be represented as 2003-W01-1. 2003-01-01 (Wednesday 1st January 2003) would be represented as 2003-W01-3.
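The two dates in the quoted example can be checked with a minimal Python sketch. It relies on `date.isocalendar()` from the standard library, which follows the same ISO 8601 week rules. The helper name `iso_week_label` is made up for this illustration and is not part of Elasticsearch or Joda-Time.

```python
from datetime import date


def iso_week_label(d: date) -> str:
    """Format a date as an ISO 8601 week date, e.g. 2003-W01-3."""
    # isocalendar() yields (week-based year, week number, weekday with Monday = 1)
    iso_year, iso_week, iso_weekday = d.isocalendar()
    return f"{iso_year}-W{iso_week:02d}-{iso_weekday}"


print(iso_week_label(date(2002, 12, 30)))  # Monday 30 December 2002
print(iso_week_label(date(2003, 1, 1)))    # Wednesday 1 January 2003
```

Both dates land in week 1 of the week-based year 2003, even though the first one is still in calendar year 2002.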
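So, in your case, you are seeing 29-12-2008 as belonging to week 1 because Dec 29th 2008 is a Monday that starts a week with three days in 2008 and four days in 2009. According to the above rule, that is week 1 of 2009. This has nothing to do with leap years. To give you an example, try indexing 31-12-2009 and 31-12-2015. Both will give you week 53, and neither is a leap year. The same rule most likely explains the two "1" keys you see for 2013: the week starting 2012-12-31 is week 1 of 2013, the week starting 2013-12-30 is week 1 of 2014, and both contain days from calendar year 2013.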
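As a quick sanity check outside Elasticsearch, here is a small Python sketch that applies the ISO 8601 week rule to the dates discussed here. It assumes Python's `date.isocalendar()` matches the Joda-Time week rule quoted above (both implement ISO 8601). The helper name `iso_year_week` is hypothetical.

```python
from datetime import date


def iso_year_week(d: date) -> tuple:
    """Return (week-based year, week number) for a calendar date."""
    iso_year, iso_week, _ = d.isocalendar()
    return (iso_year, iso_week)


dates = [
    date(2008, 12, 29),  # start of the last weekly bucket in the 2008 result
    date(2009, 12, 31),  # not a leap year
    date(2015, 12, 31),  # not a leap year
    date(2012, 12, 31),  # Monday of the week containing 1 January 2013
    date(2013, 12, 30),  # Monday of the week containing 31 December 2013
]
for d in dates:
    print(d.isoformat(), iso_year_week(d))
```

The week number depends only on how the year boundary falls within the Monday-to-Sunday week, not on whether the year has a 29 February.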
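To see this more clearly, I suggest using "format": "x-w---yyyy-MM-dd" in your weekly aggregation. In Joda-Time patterns, x is the week-based year and w is the week of that year, so each bucket shows its week-based year and week number next to the calendar date on which the bucket starts: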
    {
      "size": 0,
      "aggs": {
        "activities_per_year": {
          "date_histogram": {
            "field": "start",
            "interval": "1y",
            "format": "yyyy"
          },
          "aggs": {
            "activities_per_week": {
              "date_histogram": {
                "field": "start",
                "interval": "week",
                "format": "x-w---yyyy-MM-dd"
              }
            }
          }
        }
      }
    }
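If the goal is a dictionary keyed by week, one option is to key it on the week-based year and the week number together, so that week 1 of 2009 cannot collide with week 1 of 2008. The Python sketch below does this client-side from the numeric `key` values (epoch milliseconds, assumed to be UTC) copied from the 2008 result in the question. The helpers `week_parts` and `week_label` are hypothetical names. The label only imitates the "x-w---yyyy-MM-dd" pattern, so compare it against the `key_as_string` values in a real response before relying on it.

```python
from datetime import datetime, timezone


def week_parts(key_ms: int):
    """Return (week-based year, week number, start date) for a bucket key."""
    d = datetime.fromtimestamp(key_ms / 1000, tz=timezone.utc).date()
    iso_year, iso_week, _ = d.isocalendar()
    return iso_year, iso_week, d


def week_label(key_ms: int) -> str:
    """Imitate the pattern: week-based year, week number, calendar date."""
    iso_year, iso_week, d = week_parts(key_ms)
    return f"{iso_year}-{iso_week}---{d.isoformat()}"


# (key, doc_count) pairs taken from the weekly buckets in the question.
buckets = [(1199059200000, 6), (1229904000000, 7), (1230508800000, 1)]

counts = {}
for key_ms, doc_count in buckets:
    iso_year, iso_week, _ = week_parts(key_ms)
    counts[(iso_year, iso_week)] = doc_count  # key on (week-based year, week)
    print(week_label(key_ms), doc_count)
```

With this keying, the last bucket of the 2008 result is stored under (2009, 1) and the first one under (2008, 1), so neither overwrites the other.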
Upvotes: 1