Sreevisakh

Reputation: 1926

Group-by query in Elasticsearch

I have an Elasticsearch cluster holding the analytics data for my website. A page view event is recorded whenever a user visits a page. Each page view event has a session-id field, which stays the same for the whole user session.

I would like to calculate the duration of each session by grouping the events by session id and taking the difference between the timestamps of the first and last events.

Is there any way to achieve this with an Elasticsearch query?

Pageview events

[
  {
    "session-id": "234234-234234-324324-23432432",
    "url": "testpage1",
    "timestamp": 54323424222
  },
  {
    "session-id": "234234-234234-324324-23432432",
    "url": "testpage2",
    "timestamp": 54323424223
  },
  {
    "session-id": "234234-234234-324324-23432432",
    "url": "testpage3",
    "timestamp": 54323424224
  }
]

The session duration will be (54323424224 - 54323424222) ms, i.e. 2 ms.
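To make the intended calculation concrete, here is a minimal Python sketch of the same grouping done outside Elasticsearch. The function name `session_durations` is made up for illustration; the events are the three sample documents from above.

```python
# Sample page view events copied from the question.
events = [
    {"session-id": "234234-234234-324324-23432432", "url": "testpage1", "timestamp": 54323424222},
    {"session-id": "234234-234234-324324-23432432", "url": "testpage2", "timestamp": 54323424223},
    {"session-id": "234234-234234-324324-23432432", "url": "testpage3", "timestamp": 54323424224},
]


def session_durations(events):
    """Return {session_id: last_timestamp - first_timestamp} (hypothetical helper)."""
    bounds = {}  # session id -> (earliest timestamp, latest timestamp)
    for event in events:
        sid = event["session-id"]
        ts = event["timestamp"]
        lo, hi = bounds.get(sid, (ts, ts))
        bounds[sid] = (min(lo, ts), max(hi, ts))
    return {sid: hi - lo for sid, (lo, hi) in bounds.items()}


durations = session_durations(events)
print(durations)  # {'234234-234234-324324-23432432': 2}
```

This is only a reference for the expected result; the goal is to get the same number per session from an aggregation.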

EDIT:

I was able to create a data table visualization with the session id, max timestamp, and min timestamp by querying min(timestamp) and max(timestamp) for each session id. Now all I need is the difference between these two aggs.

Upvotes: 1

Views: 284

Answers (2)

Praneeth

Reputation: 761

A stats aggregation nested inside a terms aggregation on the session id will give you the min and max timestamps per session. You can then calculate the difference between them (max - min) using a bucket script aggregation.

Refer: bucket-script-aggregation and stats-bucket-aggregation.

You can use the following query to calculate the difference between the max and min timestamps per session-id:

{
  "size": 0,
  "aggs": {
    "session": {
      "terms": {
        "field": "session-id.keyword",
        "size": 10
      },
      "aggs": {
        "stats_bucket":{
          "stats":{
            "field": "timestamp"  
          }
        },
        "time_spent": {
          "bucket_script": {
            "buckets_path": {
              "min_stats": "stats_bucket.min",
              "max_stats": "stats_bucket.max"
            },
            "script": "params.max_stats - params.min_stats"
          }
        }
      }
    }
  }
}
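As a rough sketch of how the result could be read on the client, the Python below pulls `time_spent` out of each bucket. The `sample_response` dict is hand-written and trimmed to mimic the usual shape of a terms aggregation response (it is not real cluster output), and the helper name `durations_from_response` is hypothetical.

```python
# Hand-written, trimmed response fragment (an assumption, not real output).
# Aggregation names match the query above: "session", "stats_bucket", "time_spent".
sample_response = {
    "aggregations": {
        "session": {
            "buckets": [
                {
                    "key": "234234-234234-324324-23432432",
                    "doc_count": 3,
                    "stats_bucket": {"min": 54323424222.0, "max": 54323424224.0},
                    "time_spent": {"value": 2.0},
                }
            ]
        }
    }
}


def durations_from_response(response, terms_name="session", script_name="time_spent"):
    """Map each session id to the value computed by the bucket script (hypothetical helper)."""
    buckets = response["aggregations"][terms_name]["buckets"]
    return {bucket["key"]: bucket[script_name]["value"] for bucket in buckets}


time_spent = durations_from_response(sample_response)
print(time_spent)  # {'234234-234234-324324-23432432': 2.0}
```

Note that `"size": 10` in the terms aggregation limits the result to ten session ids, so a larger size (or pagination) would be needed to cover every session.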

Upvotes: 1

jordivador

Reputation: 1096

There's no way to compute the difference between max and min inside buckets.

Try this query instead, and calculate the difference between max and min on the client side:

{
  "aggs": {
    "bySession": {
      "terms": {
        "field": "session-id.keyword"
      },
      "aggs": {
        "statsBySession": {
          "stats": {
            "field": "timestamp"
          }
        }
      }
    }
  }
}
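The client-side step could look roughly like the Python sketch below. It assumes the response has the usual terms aggregation shape with one `statsBySession` object per bucket; `sample_response` is hand-written and trimmed for illustration, and `durations_from_stats` is a hypothetical helper name.

```python
# Hand-written, trimmed response fragment (an assumption, not real output).
# Aggregation names match the query above: "bySession" and "statsBySession".
sample_response = {
    "aggregations": {
        "bySession": {
            "buckets": [
                {
                    "key": "234234-234234-324324-23432432",
                    "doc_count": 3,
                    "statsBySession": {"count": 3, "min": 54323424222.0, "max": 54323424224.0},
                }
            ]
        }
    }
}


def durations_from_stats(response, terms_name="bySession", stats_name="statsBySession"):
    """Compute max - min per session bucket on the client (hypothetical helper)."""
    durations = {}
    for bucket in response["aggregations"][terms_name]["buckets"]:
        stats = bucket[stats_name]
        # Skip buckets whose documents carry no timestamp value.
        if stats.get("min") is None or stats.get("max") is None:
            continue
        durations[bucket["key"]] = stats["max"] - stats["min"]
    return durations


session_lengths = durations_from_stats(sample_response)
print(session_lengths)  # {'234234-234234-324324-23432432': 2.0}
```

Because the terms aggregation in the query sets no `size`, it returns only the default top ten buckets, so the sketch would see at most ten sessions unless the query is adjusted.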

Upvotes: 1
