taher chhabrawala

Reputation: 4120

Find paths from a graph and then count how many times a path occurs in Azure Cosmos DB using Gremlin

I am storing clickstream events in a graph database with the following structure:

A user performs multiple events, and each event has a 'previous' edge pointing to the event before it.

Each event has a property named referer. For example, if a user views the page www.foobar.com/aaa, there will be a page view event with referer:www.foobar.com/aaa

Now I want to find the possible paths starting from the homepage, along with the number of times each path occurs.

Using the Gremlin query below I am able to find the possible paths, but I am not able to group them to get a count for each path:

g.V().hasLabel('event').has('referer','https://www.foobar.com/').in('previous').in('previous').path().by('referer')

Output:

 [
      {
        "labels": [
          [],
          [],
          []
        ],
        "objects": [
          "https://www.foobar.com/",
          "https://www.foobar.com/aaa",
          "https://www.foobar.com/bbb"
        ]
      },
      {
        "labels": [
          [],
          [],
          []
        ],
        "objects": [
          "https://www.foobar.com/",
          "https://www.foobar.com/aaa",
          "https://www.foobar.com/bbb"
        ]
      },
      {
        "labels": [
          [],
          [],
          []
        ],
        "objects": [
          "https://www.foobar.com/",
          "https://www.foobar.com/ccc",
          "https://www.foobar.com/ddd"
        ]
      }
    ]

I want an output like this:

[[
  "https://www.foobar.com/",
  "https://www.foobar.com/aaa",
  "https://www.foobar.com/bbb"
]:2,
[
  "https://www.foobar.com/",
  "https://www.foobar.com/ccc",
  "https://www.foobar.com/ddd"
]:1]

Since I am using the Azure Cosmos DB graph database, only the Gremlin steps listed here are available: https://learn.microsoft.com/en-us/azure/cosmos-db/gremlin-support
Thanks

Upvotes: 1

Views: 512

Answers (1)

Kelvin Lawrence

Reputation: 14371

You can apply groupCount to a path using a syntax such as this:

groupCount().by(path().by('referer'))

So you could rewrite your query as:

g.V().hasLabel('event').
      has('referer','https://www.foobar.com/').
      in('previous').
      in('previous').
      groupCount().by(path().by('referer'))
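
If it helps to see what that groupCount step is tallying, here is a minimal Python sketch that does the same count client-side on the path() results shown in the question. The names path_results and count_paths are made up for illustration, and the data is simply copied from the question's sample output; none of this is part of the Gremlin or Cosmos DB API.

```python
from collections import Counter

# Sample data copied from the path() output in the question.
path_results = [
    {
        "labels": [[], [], []],
        "objects": [
            "https://www.foobar.com/",
            "https://www.foobar.com/aaa",
            "https://www.foobar.com/bbb",
        ],
    },
    {
        "labels": [[], [], []],
        "objects": [
            "https://www.foobar.com/",
            "https://www.foobar.com/aaa",
            "https://www.foobar.com/bbb",
        ],
    },
    {
        "labels": [[], [], []],
        "objects": [
            "https://www.foobar.com/",
            "https://www.foobar.com/ccc",
            "https://www.foobar.com/ddd",
        ],
    },
]


def count_paths(results):
    """Count how many times each sequence of referers occurs.

    Hypothetical helper: each path's "objects" list is turned into a
    tuple so it can be used as a dictionary key.
    """
    return Counter(tuple(result["objects"]) for result in results)


path_counts = count_paths(path_results)

# Print the paths, most frequent first.
for path, count in path_counts.most_common():
    print(list(path), count)
```

Doing the count in the query is usually preferable because only the grouped result is sent back, but a client-side tally like this could serve as a fallback if a step turns out not to be supported.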

Hope this helps,

Cheers Kelvin

Upvotes: 0
