Reputation: 1411
I'm having trouble creating the query to plot a running sum of unique users per day.
I have a table of usages. Lets consider this table:
+--------+---------------------+
| UserId | Timestamp |
+--------+---------------------+
| 1 | 2020-6-9T08:00:00Z |
+--------+---------------------+
| 2 | 2020-6-9T09:23:00Z |
+--------+---------------------+
| 3 | 2020-6-9T14:12:00Z |
+--------+---------------------+
| 3 | 2020-6-10T15:12:00Z |
+--------+---------------------+
| 3 | 2020-6-10T18:01:00Z |
+--------+---------------------+
| 4 | 2020-6-10T02:02:00Z |
+--------+---------------------+
The resulting table I would like to produce is a cumulative sum, per day, such that the sum increases only if that user did not report usage previously.
The graph above would produce the following result. The key here is that on day 2020-6-10, User 3 is not included in the sum, since User 3 reported Usage on 2020-6-9. Also, notice User 3 was only report once on 2020-6-10 (each day is unique usage).
+-----------+-------+
| Day | Users |
+-----------+-------+
| 2020-6-9 | 3 |
+-----------+-------+
| 2020-6-10 | 1 |
+-----------+-------+
Upvotes: 0
Views: 805
Reputation: 2744
Please take a look at activity_counts_metrics plugin:
let start = datetime(2020-06-09);
let end = datetime(2020-06-11);
datatable (UserId:int, Timestamp:datetime)
[
1, datetime(2020-6-9T08:00:00Z),
2, datetime(2020-6-9T09:23:00Z),
3, datetime(2020-6-9T14:12:00Z),
3, datetime(2020-6-10T15:12:00Z),
3, datetime(2020-6-10T18:01:00Z),
4, datetime(2020-6-10T02:02:00Z),
]
| evaluate activity_counts_metrics(UserId, Timestamp, start, end, 1d)
| project Timestamp, new_dcount
Upvotes: 4