I have an Oozie workflow that needs to use the same date in several formats. For example, assume I'm running the workflow on 16th January 2015 with the property runDate=20150116 in job.properties. I'd like to be able to automatically use the following paths in Oozie actions:
external-file-20150116.csv
and some other data named:
/rootDir/resource/150116/*
The first example would be easy enough, I'd simply refer to:
external-file-${runDate}.csv
but the second example wouldn't be possible, because it needs the same date as yyMMdd (150116).
The only built-in Oozie EL function I can find is timestamp(), which is no use here as it returns a fixed format and offers no manipulation. A coordinator would seem to solve the problem, as I'd be able to use all of the coord EL functions. However, I'll need to run this workflow occasionally on an ad-hoc basis, in which case I'd be using a job.properties file and not a coordinator.
Any suggestions as to how I can manipulate dates without using a coordinator?
Upvotes: 2
After lots of messing around and research, I've found the following solution. Unlike the other answer it does not require inserting one variable per required date format into the job. My solution is based on using an EL function - basically a UDF but for Oozie.
Create an EL function to allow dates to have their formats modified. EL functions are written in Java, and unlike Hive UDFs do not require any class extension, although any methods that will be called by Oozie should be static.
The code for this method is:
package org.watsonb.elfunctions;

import org.joda.time.DateTime;
import org.joda.time.format.DateTimeFormat;
import org.joda.time.format.DateTimeFormatter;

public class DateEL {

    // Must be static so that Oozie can call it as an EL function.
    public static String convertDate(String inputDate, String inputDateFormat, String outputDateFormat) {
        // Parse the input using its Joda-Time pattern, e.g. "yyyyMMdd".
        DateTimeFormatter formatter = DateTimeFormat.forPattern(inputDateFormat);
        DateTime dateTime = formatter.parseDateTime(inputDate);
        // Print the same date using the output pattern, e.g. "yy-MM-dd".
        formatter = DateTimeFormat.forPattern(outputDateFormat);
        return formatter.print(dateTime);
    }
}
Build this class, and place the generated jar file in /var/lib/oozie on the Oozie server box.
On Ambari's Oozie config page, create or find the oozie.service.ELService.ext.functions.workflow property in the Custom oozie-site.xml tab, and add the following (if the property already exists, separate each function declaration with a comma):
convertDateEl=org.watsonb.elfunctions.DateEL#convertDate
In this example:
convertDateEl is the name of the function that will be called within Oozie workflows,
org.watsonb.elfunctions.DateEL is the fully qualified class name,
convertDate is the name of the method in the class.
If not using Ambari, add the property to oozie-site.xml.
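To make the declaration format concrete, here is a minimal sketch of how a comma-separated value of name=class#method entries breaks down into its parts. It is an illustration only and not Oozie's own parsing code; the class ElDeclarationSketch and its parse method are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: splits "name=class#method" declarations; not Oozie's parser.
class ElDeclarationSketch {

    // Returns function name -> {fully qualified class name, method name}.
    static Map<String, String[]> parse(String propertyValue) {
        Map<String, String[]> functions = new LinkedHashMap<>();
        for (String declaration : propertyValue.split(",")) {
            declaration = declaration.trim();
            if (declaration.isEmpty()) {
                continue; // tolerate stray commas
            }
            int equals = declaration.indexOf('=');
            int hash = declaration.indexOf('#', equals + 1);
            if (equals < 1 || hash < 0) {
                throw new IllegalArgumentException("Expected name=class#method, got: " + declaration);
            }
            String name = declaration.substring(0, equals).trim();
            String className = declaration.substring(equals + 1, hash).trim();
            String methodName = declaration.substring(hash + 1).trim();
            functions.put(name, new String[] {className, methodName});
        }
        return functions;
    }

    public static void main(String[] args) {
        Map<String, String[]> functions =
                parse("convertDateEl=org.watsonb.elfunctions.DateEL#convertDate");
        String[] target = functions.get("convertDateEl");
        System.out.println(target[0] + " -> " + target[1]);
    }
}
```

The name on the left of the equals sign is what workflows call; the class and method on the right are what Oozie has to find on its classpath, which is why the jar must be in place before the restart.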
Restart the Oozie service. The function is now available to any Oozie workflow.
Inside a workflow, call:
${convertDateEl(runDate, "yyyyMMdd", "yy-MM-dd")}
to return a formatted date. For example:
<arg>/output/telephone-records-${convertDateEl(runDate, "yyyyMMdd", "yy-MM-dd")}.csv</arg>
would, at runtime, turn into:
<arg>/output/telephone-records-12-09-30.csv</arg>
if runDate is 20120930.
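The expected output of a pattern pair can be checked locally before deploying anything to the Oozie server. The sketch below is a stand-in that uses the JDK's java.time API rather than Joda-Time, so that it runs without extra jars; the class name DateElSketch is hypothetical, it only handles date-only patterns, and java.time pattern letters are assumed to match Joda-Time's for the simple patterns used here.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical java.time stand-in for the Joda-Time EL function (date-only patterns).
class DateElSketch {

    // Parses inputDate with inputDateFormat and re-prints it with outputDateFormat.
    static String convertDate(String inputDate, String inputDateFormat, String outputDateFormat) {
        LocalDate date = LocalDate.parse(inputDate, DateTimeFormatter.ofPattern(inputDateFormat));
        return date.format(DateTimeFormatter.ofPattern(outputDateFormat));
    }

    public static void main(String[] args) {
        // The example above: 20120930 rendered as yy-MM-dd.
        System.out.println(convertDate("20120930", "yyyyMMdd", "yy-MM-dd")); // 12-09-30
        // The directory name from the question: 20150116 rendered as yyMMdd.
        System.out.println(convertDate("20150116", "yyyyMMdd", "yyMMdd"));   // 150116
    }
}
```

With the EL function registered, the second case corresponds to using ${convertDateEl(runDate, "yyyyMMdd", "yyMMdd")} in the /rootDir/resource/ path from the question.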
Source: http://blog.cloudera.com/blog/2013/09/how-to-write-an-el-function-in-apache-oozie/ - I found this useful but a bit too verbose.
Upvotes: 1
There are 3 ways to input an oozie job property.
In your use case, you could add something like the following to the oozie command line, so that the shell fills in today's date:
-DrunDate=`date +%Y%m%d`
Upvotes: 1