Charles Menguy

Reputation: 41428

Custom Processing on Python logging messages in a generic way

I am trying to figure out the best way to apply custom processing to Python logging messages with minimal impact on our codebase.

The problem is this: we have many different projects that log a lot, and some of the logged messages contain AWS keys. As a security requirement, we need to strip all AWS keys from the logs, and there are multiple ways to go about this:

I have done some research on approach #3 but couldn't find a way to do it. Does anyone have experience applying custom processing to logging messages in a way that would fit this use case?

Upvotes: 3

Views: 888

Answers (1)

Vinay Sajip

Reputation: 99365

You could use a custom LogRecord class to achieve this, as long as you can identify keys in the text unambiguously. For example:

import logging
import re

KEY = 'PK_SOME_PUBLIC_KEY'
SECRET_KEY = 'SK_SOME_PRIVATE_KEY'

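class StrippingLogRecord(logging.LogRecord):
    """LogRecord that redacts anything resembling a key from the final message."""

    # Matches tokens such as PK_... or SK_..., case-insensitively
    pattern = re.compile(r'\b[PS]K_\w+\b', re.I)

    def getMessage(self):
        # Let the base class merge msg and args first, then redact the result
        message = super(StrippingLogRecord, self).getMessage()
        message = self.pattern.sub('-- key redacted --', message)
        return message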

if hasattr(logging, 'setLogRecordFactory'):
    # 3.x has this
    logging.setLogRecordFactory(StrippingLogRecord)
else:
    # 2.x needs monkey-patching
    logging.LogRecord = StrippingLogRecord

logging.basicConfig(level=logging.DEBUG)
logging.debug('Message with a %s', KEY)
logging.debug('Message with a %s', SECRET_KEY)

In my example I've assumed you could use a simple regex to spot keys, but a more sophisticated alternative method could be used if that's not workable.
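As a rough illustration of swapping in a different pattern, here is a minimal Python 3 sketch aimed at AWS-style access key IDs. The regex is an assumption on my part (IDs starting with AKIA or ASIA followed by 16 upper-case letters or digits) and it does not attempt to catch secret access keys; `AwsStrippingLogRecord` and `demo` are hypothetical names used only for this example.

```python
import io
import logging
import re

# Assumed format (not from the answer above): access key IDs that start with
# AKIA or ASIA, followed by 16 upper-case letters or digits.
ACCESS_KEY_ID = re.compile(r'\b(?:AKIA|ASIA)[A-Z0-9]{16}\b')


class AwsStrippingLogRecord(logging.LogRecord):
    """Hypothetical LogRecord subclass that redacts AWS-style access key IDs."""

    def getMessage(self):
        # Build the final message first, then redact anything that matches
        message = super().getMessage()
        return ACCESS_KEY_ID.sub('-- key redacted --', message)


def demo():
    """Log one message through the custom factory and return the captured text."""
    previous = logging.getLogRecordFactory()
    logging.setLogRecordFactory(AwsStrippingLogRecord)
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    logger = logging.getLogger('redaction_demo')
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep the demo output out of the root logger
    logger.addHandler(handler)
    try:
        # AKIAIOSFODNN7EXAMPLE is the placeholder key ID used in AWS documentation
        logger.debug('Using key %s', 'AKIAIOSFODNN7EXAMPLE')
    finally:
        # Restore global logging state so the demo has no lasting side effects
        logger.removeHandler(handler)
        logging.setLogRecordFactory(previous)
    return stream.getvalue()


if __name__ == '__main__':
    print(demo(), end='')
```

The pattern lives in one module-level constant, so tightening or extending it later should not require touching any of the code that logs.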

Note that the above code must run before any of the code which logs keys: only records created after the custom class is installed get redacted.
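To make the ordering point concrete, here is a small Python 3 sketch that logs the same value before and after installing the factory. `RedactingRecord` and `log_before_and_after` are hypothetical names, and the pattern is the same PK_/SK_ placeholder used in the example above. It also suggests that a logger created earlier still picks up the new factory, since the factory is looked up when each record is created.

```python
import io
import logging
import re


class RedactingRecord(logging.LogRecord):
    """Hypothetical record class using the same PK_/SK_ pattern as above."""

    pattern = re.compile(r'\b[PS]K_\w+\b', re.I)

    def getMessage(self):
        return self.pattern.sub('-- key redacted --', super().getMessage())


def log_before_and_after():
    """Return the lines logged before and after the factory is installed."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    logger = logging.getLogger('ordering_demo')  # created before the factory is set
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    logger.addHandler(handler)
    previous = logging.getLogRecordFactory()
    try:
        # Factory not installed yet: the key is written out as-is
        logger.debug('early: %s', 'SK_SOME_PRIVATE_KEY')
        logging.setLogRecordFactory(RedactingRecord)
        # Same logger object, but new records now go through RedactingRecord
        logger.debug('late: %s', 'SK_SOME_PRIVATE_KEY')
    finally:
        # Restore global logging state
        logging.setLogRecordFactory(previous)
        logger.removeHandler(handler)
    return stream.getvalue().splitlines()


if __name__ == '__main__':
    for line in log_before_and_after():
        print(line)
```

In practice that means putting the installation in whatever module each project imports first, such as a shared logging setup module.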

Upvotes: 5
