Reputation: 2371
In Oozie, input-events are pretty straightforward: if the specified file/folder is not present, the coordinator job is kept in WAITING state. But I could not understand what output-events does.
As per my understanding, the files/folders specified in the output-events tag should be created by Oozie if all specified actions are successful. But that does not happen. I cannot find any relevant logs either, and the documentation is not clear about this.
So, the question is: does Oozie really create the files/folders specified in output-events? Or does it just declare that these particular files/folders are created during the workflow, with the responsibility for creating them on the jobs, not on Oozie?
Relevant piece of code can be found at https://gist.github.com/venkateshshukla/de0dc395797a7ffba153
Upvotes: 5
Views: 1220
Reputation: 886
The official Oozie Coordinator documentation is not very clear on the exact purpose of the output-events element. However, the book "Apache Oozie: The Workflow Scheduler for Hadoop" mentions the following:
"During reprocessing of a coordinator, Oozie tries to help the retry attempt by cleaning up the output directories by default. For this, it uses the <output-events> specification in the coordinator XML to remove the old output before running the new attempt. Users can override this default behavior using the –noCleanup option."
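To make that concrete, here is a minimal sketch of a coordinator that declares both input-events and output-events. The dataset names (raw, processed), the paths, the dates, and the properties nameNode, workflowAppPath, inputDir and outputDir are all hypothetical, chosen only for illustration; this is not the code from the question's gist.

```xml
<!-- Illustrative sketch only: names, paths and dates are assumptions. -->
<coordinator-app name="example-coord" frequency="${coord:days(1)}"
                 start="2015-01-01T00:00Z" end="2015-12-31T00:00Z"
                 timezone="UTC" xmlns="uri:oozie:coordinator:0.4">
  <datasets>
    <dataset name="raw" frequency="${coord:days(1)}"
             initial-instance="2015-01-01T00:00Z" timezone="UTC">
      <uri-template>${nameNode}/data/raw/${YEAR}/${MONTH}/${DAY}</uri-template>
    </dataset>
    <dataset name="processed" frequency="${coord:days(1)}"
             initial-instance="2015-01-01T00:00Z" timezone="UTC">
      <uri-template>${nameNode}/data/processed/${YEAR}/${MONTH}/${DAY}</uri-template>
    </dataset>
  </datasets>

  <!-- The coordinator action stays in WAITING until this instance exists. -->
  <input-events>
    <data-in name="input" dataset="raw">
      <instance>${coord:current(0)}</instance>
    </data-in>
  </input-events>

  <!-- Oozie does not create this path; it only records where the
       workflow is expected to write. -->
  <output-events>
    <data-out name="output" dataset="processed">
      <instance>${coord:current(0)}</instance>
    </data-out>
  </output-events>

  <action>
    <workflow>
      <app-path>${workflowAppPath}</app-path>
      <configuration>
        <property>
          <name>inputDir</name>
          <value>${coord:dataIn('input')}</value>
        </property>
        <!-- The workflow actions must write to this directory themselves. -->
        <property>
          <name>outputDir</name>
          <value>${coord:dataOut('output')}</value>
        </property>
      </configuration>
    </workflow>
  </action>
</coordinator-app>
```

With a definition along these lines, the resolved output path is handed to the workflow through coord:dataOut('output'), and a rerun of a coordinator action (for example with oozie job -rerun and an -action range) would be expected to remove that path first unless the no-cleanup option is passed.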
So, in summary:
- The files/folders named in output-events are not automatically created by Oozie; you need to create them in your Oozie workflow actions.
- The output-events configuration tells Oozie which files your workflow actions will create, so that Oozie can clean them up when rerunning/reprocessing a coordinator.
Upvotes: 6