Jesse Clark

Reputation: 1200

How do I write messages to the output log on AWS Glue?

By default, AWS Glue jobs write errors and output to two different CloudWatch log groups: /aws-glue/jobs/error and /aws-glue/jobs/output. When I add print() statements to my scripts for debugging, they end up in the error log (/aws-glue/jobs/error).

I have tried using:

log4jLogger = sparkContext._jvm.org.apache.log4j 
log = log4jLogger.LogManager.getLogger(__name__) 
log.warn("Hello World!")

but "Hello World!" doesn't show up in either of the logs for the test job I ran.

Does anyone know how to go about writing debug log statements to the output log (/aws-glue/jobs/output)?

TIA!

EDIT:

It turns out the above actually does work. I was running the job in the AWS Glue script editor window, which captures the Command-F key combination and searches only within the current script. So when I tried to search the page for the logging output, it looked as if nothing had been logged.

NOTE: I did discover through testing the first responder's suggestion that AWS Glue scripts don't seem to output any log message with a level less than WARN!

Upvotes: 45

Views: 82073

Answers (7)

Rachana

Reputation: 11

If you're just debugging, print() (Python) or println() (Scala) works just fine.

Upvotes: 1

Masi N.

Reputation: 41

This worked for INFO level in a Glue Python job:

import logging
import sys

root = logging.getLogger()
root.setLevel(logging.DEBUG)

handler = logging.StreamHandler(sys.stdout)
handler.setLevel(logging.DEBUG)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
handler.setFormatter(formatter)
root.addHandler(handler)
root.info("check")

source
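The same idea can be wrapped in a small helper so that it can be tried outside Glue as well. This is only a sketch using the standard logging module; the function name make_stdout_logger and the logger name "glue_job" are made up for illustration.

```python
import logging
import sys


def make_stdout_logger(name, level=logging.INFO):
    """Return a logger that writes records at `level` or above to stdout."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.propagate = False  # do not also pass records up to the root logger
    if not logger.handlers:  # avoid duplicate lines if called twice
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(logging.Formatter(
            "%(asctime)s - %(name)s - %(levelname)s - %(message)s"))
        logger.addHandler(handler)
    return logger


log = make_stdout_logger("glue_job")  # hypothetical logger name
log.info("check")    # emitted
log.debug("detail")  # dropped: DEBUG is below the INFO threshold
```

Using a named logger instead of the root logger keeps the handler from affecting other libraries' log output.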

Upvotes: 3

Simon77

Reputation: 406

Just in case this helps. This works to change the log level.

from awsglue.context import GlueContext
from pyspark.context import SparkContext

sc = SparkContext()
sc.setLogLevel('DEBUG')
glueContext = GlueContext(sc)
logger = glueContext.get_logger()
logger.info('Hello Glue')

Upvotes: 8

feechka

Reputation: 225

I faced the same problem. I resolved it by adding logging.getLogger().addHandler(logging.StreamHandler(sys.stdout))

Before that, nothing was printed at all, not even at ERROR level.

The idea came from https://medium.com/tieto-developers/how-to-do-application-logging-in-aws-745114ac6eb7

Another option would be to log to stdout and attach AWS logging to stdout (logging to stdout is one of the recommended practices in cloud logging).

Update: this works only with setLevel("WARNING") and for messages at ERROR or WARNING level. I haven't found a way to make it work for the INFO level :(
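One possible explanation, at least in plain Python: a record has to pass the logger's own level before any handler sees it, so adding a stdout handler is not enough while the logger is still at WARNING. Whether this is what happened in the Glue job above is an assumption. The sketch below uses an in-memory stream in place of sys.stdout, and the logger name "level_demo" is illustrative.

```python
import io
import logging

stream = io.StringIO()  # stands in for sys.stdout so the result can be inspected
logger = logging.getLogger("level_demo")
logger.propagate = False  # keep the demo isolated from the root logger
logger.addHandler(logging.StreamHandler(stream))

logger.setLevel(logging.WARNING)  # WARNING is also the root logger's default
logger.info("first info")         # filtered out before the handler sees it

logger.setLevel(logging.INFO)     # lower the threshold on the logger itself
logger.info("second info")        # now reaches the handler

output = stream.getvalue()
print(output, end="")
```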

Upvotes: 2

RobotCharlie

Reputation: 1278

I noticed the above answers are written in Python. For Scala, you could do the following:

import com.amazonaws.services.glue.log.GlueLogger

object GlueApp {
  def main(sysArgs: Array[String]) {
    val logger = new GlueLogger
    logger.info("info message")
    logger.warn("warn message")
    logger.error("error message")
  }
}

Both the Python and Scala solutions are covered in the official documentation.

Upvotes: 11

Lars

Reputation: 441

I know the question is not new, but maybe this will help someone. For me, logging in Glue works with the following lines of code:

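from awsglue.context import GlueContext
from pyspark.context import SparkContext

# create the Glue context
sc = SparkContext.getOrCreate()
glueContext = GlueContext(sc)
# get the Glue logger
logger = glueContext.get_logger()
...
# write to the log (your_value is a placeholder for a string):
logger.info("s3_key:" + your_value)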
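For trying a script on a machine without the Glue libraries, the logger lookup can be given a fallback. This is a rough sketch, not an official pattern: get_job_logger, the default name "glue_job", and the sample key are hypothetical, and the fallback branch uses only the standard logging module.

```python
import logging
import sys


def get_job_logger(name="glue_job"):
    """Return the Glue logger inside a Glue job, else a stdlib logger on stdout."""
    try:
        # These modules are provided by the Glue job runtime.
        from awsglue.context import GlueContext
        from pyspark.context import SparkContext
    except ImportError:
        # Local fallback (an assumption for illustration, not part of Glue).
        fallback = logging.getLogger(name)
        fallback.setLevel(logging.INFO)
        if not fallback.handlers:
            fallback.addHandler(logging.StreamHandler(sys.stdout))
        return fallback
    glue_context = GlueContext(SparkContext.getOrCreate())
    return glue_context.get_logger()


logger = get_job_logger()
# info() and error() are available on both kinds of logger.
logger.info("s3_key: example/key.csv")  # illustrative value
```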

Upvotes: 44

Alexey Bakulin

Reputation: 1369

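Try the built-in Python logger from the logging module. Note that logging.basicConfig() attaches a handler that writes to standard error by default; pass stream=sys.stdout to it if the messages should go to the standard output stream.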

import logging

MSG_FORMAT = '%(asctime)s %(levelname)s %(name)s: %(message)s'
DATETIME_FORMAT = '%Y-%m-%d %H:%M:%S'
logging.basicConfig(format=MSG_FORMAT, datefmt=DATETIME_FORMAT)
logger = logging.getLogger(<logger-name-here>)

logger.setLevel(logging.INFO)

...

logger.info("Test log message")
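A variant of the snippet that states the target stream explicitly might look like the following. It is a sketch under a few assumptions: the logger name "glue_job_demo" is made up, and force=True (available from Python 3.8) is used so that the call takes effect even if the root logger already has handlers.

```python
import logging
import sys

MSG_FORMAT = "%(asctime)s %(levelname)s %(name)s: %(message)s"
DATETIME_FORMAT = "%Y-%m-%d %H:%M:%S"

# A StreamHandler created without a stream argument targets sys.stderr.
default_is_stderr = logging.StreamHandler().stream is sys.stderr

out = sys.stdout
logging.basicConfig(
    format=MSG_FORMAT,
    datefmt=DATETIME_FORMAT,
    stream=out,          # send records to stdout instead of the stderr default
    level=logging.INFO,  # let INFO records through the root logger
    force=True,          # replace any handlers already attached to the root
)

logger = logging.getLogger("glue_job_demo")  # hypothetical logger name
logger.info("Test log message")

handler_targets_stdout = logging.getLogger().handlers[0].stream is out
```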

Upvotes: 38
