Reputation: 5735
I have two fiends in Elasticsearch
URI
BROWSER
I want to find all those URI
which were not hit from a particular browser, say chrome
I want to write following query
1. Group by URI,
2. find distinct BROWSER set,
3. filter URIs where chrome is not in BROWERS set.
return result.
I have compeleted query for 1st step
{
"aggs": {
"uri_agg": {
"terms": {
"field": "uri.keyword",
"size": 50
},
"aggs": {
"not_hit_by_chrome": {
}
}
}
}
}
I am not getting how to achieve step 2 and 3 in this query.
Upvotes: 0
Views: 828
Reputation: 8860
There are basically two ways to achieve this.
I've simply filtered the documents not having chrome
in the browser
using Bool Query
and post that I've simply made use of two Terms Aggregation
for what you are looking for. That way, it would be more efficient as compared to applying filter on aggregation.
Structure of the query would be:
- Bool Query
- Terms Aggregation (Parent for uri)
- Terms Aggregation (Child for browsers)
Note that I'm assuming both your fields 'uri' and browser
are of type keyword
Sample Documents:
POST myindex/mydocs/1
{
"uri": "www.google.com",
"browser": "chrome"
}
POST myindex/mydocs/2
{
"uri": "www.google.com",
"browser": "firefox"
}
POST myindex/mydocs/3
{
"uri": "www.google.com",
"browser": "iexplorer"
}
Query:
POST myindex/_search
{
"size": 0,
"query": {
"bool": {
"must_not": [
{
"match": {
"browser": "chrome"
}
}
]
}
},
"aggs": {
"myuri": {
"terms": {
"field": "uri",
"size": 10
},
"aggs": {
"mybrowsers": {
"terms": {
"field": "browser",
"size": 10
}
}
}
}
}
}
Response
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 0,
"hits": []
},
"aggregations": {
"myuri": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "www.google.com",
"doc_count": 2,
"mybrowsers": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "firefox",
"doc_count": 1
},
{
"key": "iexplorer",
"doc_count": 1
}
]
}
}
]
}
}
}
If you are using xpack and if you would like to achieve this via SQL Access, then your query would translate to simple SQL query as below:
POST /_xpack/sql?format=txt
{
"query": "SELECT uri, browser, count(1) FROM myindex WHERE browser <> 'chrome' GROUP BY uri, browser"
}
Response
uri | browser | COUNT(1)
---------------+---------------+---------------
www.google.com |firefox |1
www.google.com |iexplorer |1
Let me know if this helps!
Upvotes: 1