Reputation: 4120
I am storing clickstream events in a graph database using the structure below.
A user performs multiple events, and each event has a 'previous' edge pointing to the event before it.
Each event has a property named referer. For example, if a user views the page www.foobar.com/aaa, there will be a page view event with referer: www.foobar.com/aaa
Now I want to find the possible paths starting from the homepage, along with the count of each path.
Using the Gremlin query below I am able to find the possible paths, but I am not able to group them to get the count of each path:
g.V().hasLabel('event').has('referer','https://www.foobar.com/').in('previous').in('previous').path().by('referer')
Output:
[
{
"labels": [
[],
[],
[]
],
"objects": [
"https://www.foobar.com/",
"https://www.foobar.com/aaa",
"https://www.foobar.com/bbb"
]
},
{
"labels": [
[],
[],
[]
],
"objects": [
"https://www.foobar.com/",
"https://www.foobar.com/aaa",
"https://www.foobar.com/bbb"
]
},
{
"labels": [
[],
[],
[]
],
"objects": [
"https://www.foobar.com/",
"https://www.foobar.com/ccc",
"https://www.foobar.com/ddd"
]
}
]
I want an output like this:
[[
"https://www.foobar.com/",
"https://www.foobar.com/aaa",
"https://www.foobar.com/bbb"
]:2,
[
"https://www.foobar.com/",
"https://www.foobar.com/ccc",
"https://www.foobar.com/ddd"
]:1]
Since I am using the Azure Cosmos DB graph API, only these Gremlin steps are available:
https://learn.microsoft.com/en-us/azure/cosmos-db/gremlin-support
Thanks
Upvotes: 1
Views: 512
Reputation: 14371
You can apply groupCount to a path using syntax such as this:
groupCount().by(path().by('referer'))
So you could rewrite your query as:
g.V().hasLabel('event').
has('referer','https://www.foobar.com/').
in('previous').
in('previous').
groupCount().by(path().by('referer'))
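If the server-side grouping is not available or behaves differently on your Cosmos DB account, the same counts can be computed on the client from the result of the original path() query. The following is only a minimal Python sketch of that idea, not part of the Gremlin answer itself: the function name count_paths is hypothetical, and it assumes the results arrive as a list of dicts with an "objects" list, as in the output shown in the question.

```python
from collections import Counter

def count_paths(path_results):
    """Count how many times each distinct path occurs.

    Assumes each result is a dict with an "objects" list,
    matching the path() output shown in the question.
    """
    counts = Counter()
    for result in path_results:
        # Lists are not hashable, so use a tuple as the grouping key.
        counts[tuple(result["objects"])] += 1
    return counts

# Sample data copied from the output in the question.
paths = [
    {"labels": [[], [], []],
     "objects": ["https://www.foobar.com/",
                 "https://www.foobar.com/aaa",
                 "https://www.foobar.com/bbb"]},
    {"labels": [[], [], []],
     "objects": ["https://www.foobar.com/",
                 "https://www.foobar.com/aaa",
                 "https://www.foobar.com/bbb"]},
    {"labels": [[], [], []],
     "objects": ["https://www.foobar.com/",
                 "https://www.foobar.com/ccc",
                 "https://www.foobar.com/ddd"]},
]

path_counts = count_paths(paths)

# Print the paths, most frequent first.
for path, n in path_counts.most_common():
    print(list(path), n)
```

Grouping on the client transfers every path over the wire, so the groupCount() query is preferable whenever it works for you.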
Hope this helps,
Cheers Kelvin
Upvotes: 0