Admtech
Admtech

Reputation: 61

ArangoDB Collect Sort and Limit

I am looking for a solution to what should be a simple issue. I come from a MySQL development and still have problems here and a with AQL. I want to display the comments of a user with page pagnation without duplicate article keys. I have a comment collection with

_key, user_key, content_key, comment, stamp_create 

Example:

1, 4483, 200, "reply 1", "2021-04-24T14:55:55+00:00"
2, 4483, 200, "reply 2", "2021-04-24T14:56:23+00:00"
3, 4483, 201, "reply 1", "2021-04-23T17:10:15+00:00"
4, 4483, 202, "reply 1", "2021-04-23T12:30:35+00:00"
5, 4483 , 202, "reply 2", "2021-04-22T23:50:51+00:00"

Now I need the output all comments Sorted by stamp_create DESC but with summarized content_key, With Mysql a "Group By" was enough for this. So I try it with Collect.

FOR c IN comments FILTER c.user_key =="4483" SORT c stamp_create DESC COLLECT contentKeys = c.content_key RETURN contentKeys LIMIT 0,3

Generally the result is correct, but the sorting doesn't work because COLLECT doesn't remember it. Well then with the good old DISTINCT:

FOR c IN comments FILTER c.user_key =="4483" SORT c stamp_create DESC RETURN DISTINCT contentKeys LIMIT 0,3

Result looks good and the sorting is correct, but I get only 2 results and not 3 as specified in Limit. For a page pagnation unfortunately useless, because I get more or less results (MySQL does it differently with Group By).

Next try now with a little more effort:

FOR c IN comments
    FILTER c.user_key =="4483"
    COLLECT contentKeys = c.content_key INTO groups
    LET keys = first(FOR value IN groups[*].c SORT value.stamp_create DESC RETURN {key: value._key, stamp: value.stamp_create})
    SORT keys.stamp DESC
    LIMIT 0,3
    RETURN keys

I load here all comments grouped in "groups", then sort them within the group by and then again in the whole.

The result is correct and I also get 3 results. Big disadvantage that is not really fast and I have many users with more than 60K comments.

The entire (filtered) data set (can also be 60K long) must be read in and then re-sorted. This must be easier to do.

So my question is, is there a performant way to make this faster and easier?

thanks

Upvotes: 1

Views: 728

Answers (1)

Admtech
Admtech

Reputation: 61

With the help of mpoeter, we found a faster and more elegant method.

Here is the solution:

FOR c IN comments
FILTER c.user_key =="4483"
COLLECT contentKeys = c.content_key AGGREGATE maxStamp = MAX(c.stamp_create) OPTIONS { method: "sorted" }
SORT maxStamp DESC
LIMIT 0,3
RETURN contentKeys

Thanks to the ArangoDB team :-)

Upvotes: 2

Related Questions