Reputation: 29645
I am interested in using the job_history_summary.py script to create a Task Timeline of my EMR cluster, similar to this one (the picture is from Smith College Hadoop Tutorial 1.1, but apparently originates from the Yahoo report on the TeraSort experiment).
It seems that the Hadoop logs are stored on each node, rather than on the central server. Do I need to manually combine the logs? It also seems that the script doesn't actually produce the graph.
Upvotes: 0
Views: 30
Reputation: 3956
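You can enable logging when you create the cluster and provide an S3 bucket. The logs from the nodes will then be compressed and stored in the S3 bucket you provided, so you don't have to collect them from each node by hand.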
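For example, if you launch the cluster from Python with boto3, logging is turned on by passing `LogUri` to `run_job_flow`. This is only a rough sketch: the bucket name, prefix, release label, instance types, and the default role names are assumptions for illustration, and the helper names `build_cluster_request` and `launch_cluster` are made up here, so adjust everything to your own account.

```python
def build_cluster_request(name, log_bucket, log_prefix="emr-logs/"):
    """Build keyword arguments for boto3's EMR run_job_flow with S3 logging enabled.

    All concrete values below (release label, instance types, role names)
    are assumptions for illustration, not requirements.
    """
    return {
        "Name": name,
        # LogUri is the setting that makes EMR archive cluster logs to S3.
        "LogUri": f"s3://{log_bucket}/{log_prefix}",
        "ReleaseLabel": "emr-6.15.0",  # assumed release
        "Instances": {
            "MasterInstanceType": "m5.xlarge",  # assumed instance type
            "SlaveInstanceType": "m5.xlarge",   # assumed instance type
            "InstanceCount": 3,
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        "JobFlowRole": "EMR_EC2_DefaultRole",  # assumed default role name
        "ServiceRole": "EMR_DefaultRole",      # assumed default role name
    }


def launch_cluster(request):
    """Send the request to EMR. Needs boto3 and valid AWS credentials."""
    import boto3  # imported here so building the request works without boto3

    emr = boto3.client("emr")
    response = emr.run_job_flow(**request)
    return response["JobFlowId"]


if __name__ == "__main__":
    # "my-log-bucket" is a placeholder bucket name.
    request = build_cluster_request("task-timeline-test", "my-log-bucket")
    print(request["LogUri"])
```

Once the cluster is running, the logs should appear under that prefix in the bucket, grouped by cluster ID, and you can download them from there to feed into the script.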
Upvotes: 1