How do I write messages to the output log on AWS Glue?

Jesse Clark · Feb 21, 2018 · Viewed 17.4k times

AWS Glue jobs log output and errors to two different CloudWatch logs, /aws-glue/jobs/error and /aws-glue/jobs/output by default. When I include print() statements in my scripts for debugging, they get written to the error log (/aws-glue/jobs/error).

I have tried using:

# sparkContext is assumed to be the job's SparkContext (e.g. SparkContext.getOrCreate())
log4jLogger = sparkContext._jvm.org.apache.log4j
log = log4jLogger.LogManager.getLogger(__name__)
log.warn("Hello World!")

but "Hello World!" doesn't show up in either of the logs for the test job I ran.

Does anyone know how to go about writing debug log statements to the output log (/aws-glue/jobs/output)?

TIA!

EDIT:

It turns out the above actually does work. I was running the job in the AWS Glue script editor window, which captures the Command-F key combination and searches only within the current script. So when I tried to search the page for the logging output, it looked as if nothing had been logged.

NOTE: I did discover through testing the first responder's suggestion that AWS Glue scripts don't seem to output any log message with a level less than WARN!

Answer

Alexey Bakulin · Feb 22, 2018

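Try using the built-in Python logger from the logging module. Note that basicConfig() sends messages to standard error by default, so pass stream=sys.stdout to write them to the standard output stream instead.

import logging
import sys

MSG_FORMAT = '%(asctime)s %(levelname)s %(name)s: %(message)s'
DATETIME_FORMAT = '%Y-%m-%d %H:%M:%S'
# basicConfig() logs to sys.stderr unless a stream is given.
logging.basicConfig(format=MSG_FORMAT, datefmt=DATETIME_FORMAT, stream=sys.stdout)
# Replace the placeholder with your own logger name.
logger = logging.getLogger("<logger-name-here>")

# Emit INFO and above; without this the effective level is WARNING.
logger.setLevel(logging.INFO)

...

logger.info("Test log message")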
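If basicConfig() is not convenient (it does nothing when the root logger already has handlers), one possible variation is to attach a stdout handler to a named logger directly. This is only a sketch: make_stdout_logger and the logger name "glue_job" are hypothetical names chosen for illustration, not part of AWS Glue, and whether standard output ends up in /aws-glue/jobs/output should be checked against your own job.

```python
import logging
import sys


def make_stdout_logger(name, level=logging.INFO):
    """Return a logger that writes formatted records to standard output.

    Hypothetical helper for illustration; not an AWS Glue API.
    """
    logger = logging.getLogger(name)
    logger.setLevel(level)
    # Do not pass records on to the root logger, to avoid duplicate lines.
    logger.propagate = False
    # Attach the handler only once, even if this function is called again.
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(logging.Formatter(
            '%(asctime)s %(levelname)s %(name)s: %(message)s',
            datefmt='%Y-%m-%d %H:%M:%S'))
        logger.addHandler(handler)
    return logger


logger = make_stdout_logger("glue_job")  # "glue_job" is a placeholder name
logger.info("Test log message")   # emitted: INFO meets the INFO threshold
logger.debug("Not shown")         # dropped: DEBUG is below INFO
```

Calling make_stdout_logger with the same name again returns the same logger without adding a second handler, so it should be safe to call from several modules.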