Reputation: 8586
There are a few questions similar to this, but not quite the same thing. Feels like I'm diving straight into the deep end with Elastic queries and would appreciate some advice.
This is the SQL, and I can't quite figure out how to query this type of aggregation.
SELECT COUNT(DISTINCT session_id), event_type
FROM events
GROUP BY event_type
I'd like to get a count for each event type, but only unique for a given user session.
e.g. Only count the event 'page-view' once for each user session, effectively unique page views.
Upvotes: 3
Views: 7955
Reputation: 16925
If your index mapping defines both the event_type
and the session_id
as keywords
:
PUT /events
{
"mappings": {
"properties": {
"event_type": {
"type": "keyword"
},
"session_id": {
"type": "keyword"
}
}
}
}
you could use the following aggregation query:
POST /events/_search?filter_path=aggregations.*.buckets.key,aggregations.*.buckets.distinct_session_id
{
"size": 0,
"aggs": {
"group_by_event_type": {
"terms": {
"field": "event_type",
"size": 10
},
"aggs": {
"distinct_session_id": {
"cardinality": {
"field": "session_id"
}
}
}
}
}
}
which'd yield:
{
"aggregations" : {
"group_by_event_type" : {
"buckets" : [
{
"key" : "page_view",
"distinct_session_id" : {
"value" : 2
}
},
{
...
}
]
}
}
}
Note that "size": 10
is a default setting essentially equivalent to SQL's LIMIT
clause.
Speaking of SQL, Elasticsearch supports SQL access too, albeit only within an X-Pack license. When you have one, you can do:
GET _sql?format=txt
{
"query": """
SELECT COUNT(DISTINCT session_id) distnct_sessions, event_type
FROM events
GROUP BY event_type
"""
}
which'd yield
distnct_sessions| event_type
----------------+---------------
2.0 |page_view
...
Upvotes: 6