nojohnny101

Reputation: 562

CloudWatch filter works in console, but not in boto3

I am using the boto3 logs client to find a specific log stream that contains a string. The string is fairly simple: it is the "requestID" that Lambda prints to the log when it first starts. When I filter in the console with a simple requestID string like "3f2d1c8c-4ddf-4c67-a57f-5cec3e3e8739", it works and returns the correct log stream.

When filtering through the boto3 API, it returns nothing despite the API showing that it fully searched that log stream.

log_events_resp = logs_client.filter_log_events(
    logGroupName='/aws/lambda/my-function-name',
    filterPattern="3f2d1c8c-4ddf-4c67-a57f-5cec3e3e8739"
)

What am I doing wrong? Below is the string I want to search for:

START RequestId: edca5feb-2c21-4c47-bc7c-562515094058 Version: $LATEST

Is the above a special part of the log that can't be searched via boto3 or something?

Upvotes: 0

Views: 1725

Answers (3)

kanu

Reputation: 31

There are several weird things going on with boto3 (v 1.26). Here's my code:

    if not next_token:
        response = cloudwatch.filter_log_events(
            logGroupName=log_group,
            logStreamNames=[log_stream],
            filterPattern=f'"{pattern}"',
            startTime=int(start_dt_utc.timestamp() * 1000),
            endTime=int(end_dt_utc.timestamp() * 1000),
            limit=MAX_RESPONSE_LOG_EVENTS, 
        )
    else:
        response = cloudwatch.filter_log_events(
            logGroupName=log_group,
            logStreamNames=[log_stream],
            filterPattern=f'"{pattern}"',
            # startTime=int(start_dt_utc.timestamp() * 1000),
            # endTime=int(end_dt_utc.timestamp() * 1000),
            nextToken=next_token,
            limit=MAX_RESPONSE_LOG_EVENTS,
        )
  1. I always get fewer events than limit. To get ~100 logs I have to make ~10 calls to CloudWatch. So you have to assume the response is paginated no matter what.
  2. If you specify startTime and endTime in subsequent calls (i.e., the else block above), the responses are incorrect: half the events are missing.
  3. If you don't specify filterPattern or logStreamNames in subsequent calls, you get far more events than you should.
  4. The filter pattern must be enclosed in double quotes, or else it has no effect.
  5. Special characters are not allowed in the filter pattern. Escaping with \ doesn't work.
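
The pagination in point 1 can be handled with a loop that keeps calling until the response no longer carries a nextToken. Below is a minimal sketch, assuming `client` is a boto3 `logs` client (or anything with a compatible `filter_log_events` method). `iter_filtered_events` is a made-up helper name, not part of boto3. Note that it resends every keyword argument on each page, unlike the else branch above, which drops startTime and endTime; adjust it if you run into the behaviour from point 2.

```python
def iter_filtered_events(client, **kwargs):
    """Yield events from every page of filter_log_events.

    client is assumed to be a boto3 "logs" client; kwargs are passed
    through unchanged (logGroupName, filterPattern, startTime, ...).
    """
    params = dict(kwargs)
    while True:
        response = client.filter_log_events(**params)
        # A page may hold fewer events than the limit, or none at all,
        # so an empty page is not treated as the end of the results.
        yield from response.get("events", [])
        next_token = response.get("nextToken")
        if not next_token:
            break
        params["nextToken"] = next_token
```

With boto3 this would be called along the lines of `list(iter_filtered_events(boto3.client('logs'), logGroupName=log_group, filterPattern=f'"{pattern}"'))`. boto3 also provides a built-in paginator via `client.get_paginator('filter_log_events')`, which takes care of the token handling for you.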

Upvotes: 0

nojohnny101

Reputation: 562

So after much experimentation, I have a working solution that is reliably returning results as I expect.

Here is the working code, then I'll explain:

def search_log_streams(log_group, request_id, start_time):
    # log, logit and logs_clt are not defined in this snippet: log and logit
    # are logging helpers, logs_clt is the boto3 logs client.
    func_frame_obj = log.log_function_info(locals())

    # Despite the name, this list collects the matching log events.
    log_streams = []

    kwargs = {
        "logGroupName": log_group,
        # Double quotes inside the pattern make the request ID one exact term.
        "filterPattern": f'"{request_id}"',
        "startTime": start_time,  # milliseconds since epoch
    }

    while True:
        log_events_resp = logs_clt.filter_log_events(**kwargs)
        logit.debug(f'log_events_resp: {log_events_resp}')
        log_streams.extend(log_events_resp["events"])

        # No nextToken in the response means every page has been read.
        next_token = log_events_resp.get("nextToken")
        if not next_token:
            logit.debug('no continuation token, all log streams have been searched')
            break
        kwargs["nextToken"] = next_token

    log.log_function_results(func_frame_obj, log_streams)
    return log_streams

This function takes a log_group name, the request_id, and a start_time (expressed in milliseconds since epoch, as defined in the docs [see here]).
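
For reference, a start_time in that unit can be derived from a timezone-aware datetime roughly like this. It is only a sketch: `to_epoch_millis` is a hypothetical helper name, and the five-minute look-back is an arbitrary example margin, not a value taken from the code above.

```python
from datetime import datetime, timedelta, timezone


def to_epoch_millis(dt):
    """Convert a timezone-aware datetime to integer milliseconds since epoch."""
    if dt.tzinfo is None:
        # Naive datetimes would be interpreted in local time; refuse them.
        raise ValueError("expected a timezone-aware datetime")
    return int(dt.timestamp() * 1000)


# Example: search from five minutes before a known event time (arbitrary margin).
event_time = datetime(2020, 12, 1, 12, 0, 0, tzinfo=timezone.utc)
start_time = to_epoch_millis(event_time - timedelta(minutes=5))
```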

I'm not sure whether wrapping the filterPattern value in a second pair of double quotes is what helped, or whether it was finally getting the startTime right. It took a lot of experimentation to figure out how to estimate the timestamp of the log I was looking for (the one that contained the request_id) from the event data that was feeding this lambda.
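
On the quoting question: as far as the CloudWatch Logs filter pattern documentation goes, terms that contain non-alphanumeric characters, such as the hyphens in a request ID, are meant to be wrapped in double quotes inside the pattern string itself. Here is a small sketch of that wrapping. `quote_term` is a hypothetical helper, and rejecting embedded double quotes is a simplification chosen for this example rather than a documented rule.

```python
def quote_term(term):
    """Wrap a search term in double quotes for use as a filterPattern."""
    if '"' in term:
        # Escaping rules for embedded quotes are not covered by this sketch.
        raise ValueError("term must not contain double quotes")
    return f'"{term}"'


pattern = quote_term("edca5feb-2c21-4c47-bc7c-562515094058")
```

The resulting `pattern` is the value that would be passed as `filterPattern`.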

I appreciate the help and time @Jonathan Leon

Upvotes: 1

Jonathan Leon

Reputation: 5648

I misread your search criteria as a log stream name, rather than a string inside the log message. Here's an update for you to consider. For some reason, when I run this I get an empty response, and then after running it a few more times it returns a response. Very odd, but it does work.

Also note I could only get it to work with at least the logStreamNamePrefix below. When I left it out, I couldn't get a response.

import boto3

client = boto3.client('logs')
logGroupName = '/your/loggroup'
logStreamNamePrefix = '2020/12/'
client.filter_log_events(
    logGroupName=logGroupName,
    logStreamNamePrefix=logStreamNamePrefix,
    filterPattern='7361a40d-2250-4b9a-9780-0f9feac0bb9'
)

Abbreviated output:

{'events': [],
 'searchedLogStreams': [{'logStreamName': '2020/12/01/[$LATEST]1030c0eafc5a4c17acbc96ad984f773',
   'searchedCompletely': True},
  {'logStreamName': '2020/12/01/[$LATEST]19a7a5a60ce549088745ad2fad97def',
   'searchedCompletely': True},
  {'logStreamName': '2020/12/01/[$LATEST]2ac67cb0bfa24e7880ca1ffb97a32d1',
   'searchedCompletely': True},
  {'logStreamName': '2020/12/01/[$LATEST]34f065d3da0490b816d84f8e16d44d0',
   'searchedCompletely': True},
  {'logStreamName': '2020/12/01/[$LATEST]3e26db9628ec43e88562284119514e2',
   'searchedCompletely': True},

Upvotes: 1
