John

Reputation: 11921

How to configure Flink cluster for logging via web ui?

I have a Flink cluster set up and I'd like to be able to view the logs and stdout for the JobManager and TaskManagers. When I go to the web ui, I see the following error messages on the respective tabs:

JobManager:
    Logs
        (log file unavailable)
    Stdout
        (stdout file unavailable)

TaskManager
    Logs
        Fetching TaskManager log failed.
    Stdout
        Fetching TaskManager log failed.

I can see that there are some config parameters that could be set, notably taskmanager.log.path, jobmanager.web.log.path and env.log.dir. However, there is no mention of whether these should be network-accessible paths or local paths.

What do I need to do to be able to view task manager and job manager logs?

Upvotes: 1

Views: 10949

Answers (2)

donhector

Reputation: 925

What I've found is that if you are running the official Flink Docker image (https://hub.docker.com/_/flink), it logs everything to the console by default (generally a Docker best practice, I guess). So the log4j config that is relevant to adjust is /opt/flink/conf/log4j-console.properties. This is the case for both the jobmanager(s) and the taskmanager(s).

I've therefore configured that file to log not just to the console but also to a file (a rolling one in my case):

log4j-console.properties:

    log4j.rootLogger=INFO, console, file
    # Uncomment this if you want to _only_ change Flink's logging
    #log4j.logger.org.apache.flink=INFO
    # The following lines keep the log level of common libraries/connectors on
    # log level INFO. The root logger does not override this. You have to manually
    # change the log levels here.
    log4j.logger.akka=INFO
    log4j.logger.org.apache.kafka=INFO
    log4j.logger.org.apache.hadoop=INFO
    log4j.logger.org.apache.zookeeper=INFO
    # Log all infos to the console
    log4j.appender.console=org.apache.log4j.ConsoleAppender
    log4j.appender.console.layout=org.apache.log4j.PatternLayout
    log4j.appender.console.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
    # Log all INFOs to the given rolling file
    log4j.appender.file=org.apache.log4j.RollingFileAppender
    log4j.appender.file.file=/opt/flink/log/output.log
    log4j.appender.file.MaxFileSize=5MB
    log4j.appender.file.MaxBackupIndex=5
    log4j.appender.file.append=true
    log4j.appender.file.layout=org.apache.log4j.PatternLayout
    log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
    # Suppress the irrelevant (wrong) warnings from the Netty channel handler
    log4j.logger.org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline=ERROR, console, file

With the above combined with the flink-conf.yaml below, the jobmanager's log was displayed in the JobManager's Log tab and the taskmanager's log in the TaskManager's Log tab.

flink-conf.yaml:

    # General configuration
    taskmanager.data.port: 6121
    taskmanager.rpc.port: 6122
    jobmanager.rpc.port: 6123
    blob.server.port: 6124
    query.server.port: 6125
    jobmanager.rpc.address: <your location>
    jobmanager.heap.size: 1024m
    taskmanager.heap.size: 1024m
    taskmanager.numberOfTaskSlots: 1
    web.log.path: /opt/flink/log/output.log
    taskmanager.log.path: /opt/flink/log/output.log

NOTE: I'm on Flink 1.8.0, running a small cluster in Kubernetes (i.e. separate pods for the jobmanager and taskmanagers)
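If the tabs still show errors after this change, it may help to confirm inside each pod that the configured paths actually point to readable files. Below is a minimal sketch of such a check, not part of Flink itself. The function names `read_flat_conf` and `check_log_paths` are hypothetical, the parser only handles flat `key: value` lines (no inline comments or nesting), and the location `/opt/flink/conf/flink-conf.yaml` is an assumption based on the conf directory mentioned above.

```python
import os


def read_flat_conf(path):
    """Parse flat 'key: value' lines, skipping blank lines and '#' comments."""
    conf = {}
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            if not line or line.startswith("#") or ":" not in line:
                continue
            # Split on the first colon only, so values may contain colons.
            key, value = line.split(":", 1)
            conf[key.strip()] = value.strip()
    return conf


def check_log_paths(conf, keys=("web.log.path", "taskmanager.log.path")):
    """Map each key to True if it is set and names an existing, readable file."""
    result = {}
    for key in keys:
        target = conf.get(key)
        result[key] = (
            bool(target)
            and os.path.isfile(target)
            and os.access(target, os.R_OK)
        )
    return result


if __name__ == "__main__":
    # Assumed location of the config file inside the container.
    conf = read_flat_conf("/opt/flink/conf/flink-conf.yaml")
    for key, ok in check_log_paths(conf).items():
        print(f"{key}: {'readable' if ok else 'missing or unreadable'}")
```

Run it in the jobmanager pod and in a taskmanager pod separately, since each process reads its own local file.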

Upvotes: 3

F30

Reputation: 1212

The JobManager web UI requests the TaskManager logs remotely, so these don't have to reside on a shared file system. The JobManager logs, on the other hand, appear to get read from the local file system.

With the default log4j.properties, all log files get written to the path specified by the log.file property. With the default Flink start script, the directory in this property is controlled by the env.log.dir config option (through the FLINK_LOG_DIR variable).

taskmanager.log.path only appears to get used when the JobManager requests logs from a TaskManager. However, there is a fallback to log.file if it's unset, which should cause the right directory to get used automatically. Similarly, jobmanager.web.log.path doesn't get used at all when log.file is set.

Therefore, I don't think taskmanager.log.path and jobmanager.web.log.path are relevant for a production deployment, and I can't tell how they're supposed to be used (see my corresponding Flink bug report). You may set env.log.dir to control the log file location, which in principle should also work with the web UI.
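The lookup order described in this answer can be modelled roughly as follows. This is only an illustration of the behaviour as I understand it, not Flink's actual code: the function names are hypothetical, `conf` stands for the values from flink-conf.yaml, and `system_props` stands for the JVM properties (where log.file lives).

```python
def taskmanager_log_file(conf, system_props):
    """TaskManager side: the explicit option wins, log.file is the fallback."""
    return conf.get("taskmanager.log.path") or system_props.get("log.file")


def jobmanager_log_file(conf, system_props):
    """JobManager side: log.file wins, the config option only applies without it."""
    return system_props.get("log.file") or conf.get("jobmanager.web.log.path")


if __name__ == "__main__":
    # Illustrative values only.
    props = {"log.file": "/opt/flink/log/flink-taskmanager.log"}
    print(taskmanager_log_file({}, props))
    print(jobmanager_log_file({"jobmanager.web.log.path": "/tmp/jm.log"}, props))
```

With the default start script setting log.file, both functions end up returning the log.file value unless taskmanager.log.path is set explicitly, which is why env.log.dir alone is usually enough.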

Upvotes: 0
