Reputation: 1545
How can I get the distinct number of users for a given time range that have used my service? The number of users must be shown in a CloudWatch dashboard.
I am using Cognito with a hosted UI for user authentication and an HTTP API Gateway with a Lambda integration for authorization. The API Gateway requests are handled by another Lambda function.
In the CloudWatch access logs for the API Gateway, I can log the username. I know that I can use stats count(*) by username in CloudWatch Insights to get a count of how many requests each user has sent to the API Gateway, but I don't know how I can get a list of distinct users. The count_distinct function won't work, as it only approximates the number of users when the field has high cardinality.
In the end, I want to have a number widget in my CloudWatch dashboard that will show the distinct number of users who have used the service within the selected time range.
Upvotes: 0
Views: 535
Reputation: 1545
Using custom CloudWatch dashboard widgets, I've decided to build a Lambda function that executes a Logs Insights query and renders the result as a custom widget.
import os
import time

import boto3
from aws_lambda_powertools import Logger
from aws_lambda_powertools.utilities.data_classes import (
    CloudWatchDashboardCustomWidgetEvent,
    event_source,
)
from aws_lambda_powertools.utilities.typing import LambdaContext

LOG_GROUP_NAME = os.environ["LOG_GROUP_NAME"]

logger = Logger()
cloud_watch_logs = boto3.client("logs")

DOCS = """
## User Widget
A script to get the number of unique users accessing the API in a given time range.
"""

CSS = """
<style>
    .container {
        align-content: center;
        align-items: center;
        display: flex;
        flex-direction: row;
        justify-content: center;
        width: 100%;
    }
    .value {
        font-size: 45px;
    }
</style>"""


def get_unique_api_users(start_time: int, end_time: int) -> int:
    # The query returns one row per distinct user, so the number of rows is
    # the number of users (capped by the limit of 10,000 rows set below).
    start_query_response = cloud_watch_logs.start_query(
        logGroupName=LOG_GROUP_NAME,
        startTime=start_time,
        endTime=end_time,
        queryString='filter ispresent(user) and user != "-" | stats count(*) as userCount by user',
        limit=10000,
    )

    # Poll until the query finishes; stop on failure instead of looping forever.
    while True:
        response = cloud_watch_logs.get_query_results(
            queryId=start_query_response["queryId"]
        )
        status = response["status"]
        if status == "Complete":
            return len(response["results"])
        if status in ("Failed", "Cancelled", "Timeout"):
            raise RuntimeError(f"Logs Insights query ended with status {status}")
        time.sleep(0.5)  # avoid hammering the API while the query is running


@logger.inject_lambda_context(log_event=False)
@event_source(data_class=CloudWatchDashboardCustomWidgetEvent)
def lambda_handler(event: CloudWatchDashboardCustomWidgetEvent, context: LambdaContext):
    # CloudWatch sends a "describe" request to fetch the widget documentation.
    if event.describe:
        return DOCS

    start_time = event.widget_context.time_range.start
    end_time = event.widget_context.time_range.end
    # Prefer the zoomed-in range when the user has zoomed on the dashboard.
    if event.widget_context.time_range.zoom_start:
        start_time = event.widget_context.time_range.zoom_start
        end_time = event.widget_context.time_range.zoom_end

    return f"""
    {CSS}
    <div class="container">
        <div class="value">
            🧑 {get_unique_api_users(start_time=start_time, end_time=end_time)}
        </div>
    </div>"""
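Counting the result rows works because a stats ... by user query returns one row per distinct value of user. The following is a minimal sketch of that idea, run against a hand-written sample that mimics the "results" list of get_query_results. The usernames, the counts, and the helper names distinct_users and requests_per_user are made up for illustration and are not part of the function above.

```python
from typing import Dict, List

# Hand-written sample in the shape of the "results" list returned for
# `stats count(*) as userCount by user`. All values are made up.
SAMPLE_RESULTS: List[List[Dict[str, str]]] = [
    [{"field": "user", "value": "alice"}, {"field": "userCount", "value": "12"}],
    [{"field": "user", "value": "bob"}, {"field": "userCount", "value": "3"}],
    [{"field": "user", "value": "carol"}, {"field": "userCount", "value": "7"}],
]


def distinct_users(results: List[List[Dict[str, str]]]) -> set:
    """Collect the value of the `user` field from every result row."""
    return {
        cell["value"]
        for row in results
        for cell in row
        if cell["field"] == "user"
    }


def requests_per_user(results: List[List[Dict[str, str]]]) -> Dict[str, int]:
    """Map each user to its request count (values arrive as strings)."""
    counts: Dict[str, int] = {}
    for row in results:
        fields = {cell["field"]: cell["value"] for cell in row}
        counts[fields["user"]] = int(fields["userCount"])
    return counts


if __name__ == "__main__":
    users = distinct_users(SAMPLE_RESULTS)
    # One row per user, so both numbers agree.
    print(len(users), len(SAMPLE_RESULTS))
    print(requests_per_user(SAMPLE_RESULTS))
```

Since each user appears in exactly one row, len(results) and the size of the set are the same number, which is why the widget only needs the row count.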
This approach returns the exact number of API users, as long as the result stays within the query's limit of 10,000 rows. On the downside, getting the number of users takes longer the more logs we query and the more users we have. Also, every time we refresh the widget, a Lambda function gets invoked, which counts towards our concurrent execution limit in the region and costs money on each invocation, though arguably only very little.
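To show up on a dashboard, the Lambda function has to be referenced from a widget of type "custom" in the dashboard body. Below is a minimal sketch of such a definition. The ARN, title, widget size, and the helper names build_user_widget and build_dashboard_body are placeholders chosen for illustration. The updateOn block is where you can limit which dashboard actions re-invoke the function, which helps with the invocation cost mentioned above.

```python
import json
from typing import Any, Dict


def build_user_widget(lambda_arn: str, title: str = "Unique API users") -> Dict[str, Any]:
    """Return a custom widget definition pointing at the widget Lambda."""
    return {
        "type": "custom",
        "width": 6,   # placeholder size
        "height": 4,  # placeholder size
        "properties": {
            "endpoint": lambda_arn,
            "title": title,
            # Re-run on refresh and time range changes, but not on resize,
            # so resizing the widget does not trigger another query.
            "updateOn": {"refresh": True, "resize": False, "timeRange": True},
        },
    }


def build_dashboard_body(lambda_arn: str) -> str:
    """Serialize a dashboard body that contains only the user widget."""
    return json.dumps({"widgets": [build_user_widget(lambda_arn)]})


if __name__ == "__main__":
    # Hypothetical ARN; replace it with the ARN of your own function.
    arn = "arn:aws:lambda:eu-central-1:123456789012:function:user-widget"
    print(build_dashboard_body(arn))
```

The resulting string could then be passed as DashboardBody to the CloudWatch put_dashboard call, or pasted into the dashboard's source editor in the console.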
Upvotes: 1