Reputation: 3463
I am using Amazon EMR and Hive 0.11. I am trying to create a Hive UDF that will return multiple columns from one UDF call.
For example, I would like to call a UDF like the one below and get back several named columns.
SELECT get_data(columnname) FROM table;
I am having trouble finding documentation for this, but I have heard it is possible with a GenericUDF. Does anyone know what the evaluate() method needs to return for this to work?
Upvotes: 4
Views: 4101
Reputation: 679
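I just use GenericUDTF. After you write a class that extends GenericUDTF, it should implement the two important methods: initialize and process (close must also be overridden, but it can be empty). Note that a UDTF has no evaluate() method; rows are emitted by calling forward() inside process.
The following is a simple example: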
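import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDTF;
import org.apache.hadoop.hive.serde.serdeConstants;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.StringObjectInspector;

public class UDFExtractDomainMethod extends GenericUDTF {
    // number of output columns
    private static final int OUT_COLS = 2;
    // reused buffer holding one output row
    private transient Object forwardColObj[] = new Object[OUT_COLS];
    private transient ObjectInspector[] inputOIs;

    /**
     * Validates the arguments and declares the output columns.
     *
     * @param argOIs object inspectors for the arguments; exactly one string is expected.
     * @return the output column structure (names and types).
     * @throws UDFArgumentException if the argument count or type is wrong.
     */
    @Override
    public StructObjectInspector initialize(ObjectInspector[] argOIs) throws UDFArgumentException {
        if (argOIs.length != 1 || argOIs[0].getCategory() != ObjectInspector.Category.PRIMITIVE
                || !argOIs[0].getTypeName().equals(serdeConstants.STRING_TYPE_NAME)) {
            throw new UDFArgumentException("split_url takes exactly one argument of type string");
        }
        inputOIs = argOIs;
        List<String> outFieldNames = new ArrayList<String>();
        List<ObjectInspector> outFieldOIs = new ArrayList<ObjectInspector>();
        // the names and types of the two output columns
        outFieldNames.add("host");
        outFieldNames.add("method");
        // javaStringObjectInspector corresponds to java.lang.String;
        // writableStringObjectInspector would correspond to org.apache.hadoop.io.Text
        outFieldOIs.add(PrimitiveObjectInspectorFactory.javaStringObjectInspector);
        outFieldOIs.add(PrimitiveObjectInspectorFactory.javaStringObjectInspector);
        return ObjectInspectorFactory.getStandardStructObjectInspector(outFieldNames, outFieldOIs);
    }

    @Override
    public void process(Object[] objects) throws HiveException {
        try {
            // the input OI is needed to convert the raw argument into a Java String
            String inUrl = ((StringObjectInspector) inputOIs[0]).getPrimitiveJavaObject(objects[0]);
            if (inUrl == null) {
                return; // NULL input: emit no row instead of failing in new URI(null)
            }
            URI uri = new URI(inUrl);
            forwardColObj[0] = uri.getHost();
            forwardColObj[1] = uri.getRawPath();
            // output one row with two columns
            forward(forwardColObj);
        } catch (URISyntaxException e) {
            // malformed URLs are logged and skipped; no row is emitted for them
            e.printStackTrace();
        }
    }

    @Override
    public void close() throws HiveException {
        // nothing to clean up
    }
}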
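If you want to check what the two columns will contain before deploying the jar, the parsing step in process() can be tried outside Hive. Below is a minimal sketch using only java.net.URI; the class name UrlSplitSketch and the method split are made up for this illustration and are not part of Hive or of the UDTF above. It also shows that the column named "method" actually receives the raw path of the URL.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Standalone sketch with no Hive dependencies; class and method names are hypothetical.
final class UrlSplitSketch {

    // Returns {host, rawPath}, the same two values process() puts into its output row,
    // or null when the input is not a syntactically valid URI.
    static String[] split(String url) {
        try {
            URI uri = new URI(url);
            return new String[] { uri.getHost(), uri.getRawPath() };
        } catch (URISyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String[] cols = split("http://example.com/a/b?x=1");
        System.out.println(cols[0] + " | " + cols[1]); // example.com | /a/b

        // Without a scheme the whole string is parsed as a path, so the host is null.
        String[] noScheme = split("example.com/a/b");
        System.out.println(noScheme[0] + " | " + noScheme[1]); // null | example.com/a/b

        // A space is not a legal URI character, so this input is rejected.
        System.out.println(split("http://exa mple.com/") == null); // true
    }
}
```

In Hive itself, once the class is registered as a function (split_url is assumed here, going by the error message), the call would look something like SELECT split_url(url) AS (host, method) FROM some_table, where url and some_table are placeholder names.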
Upvotes: 3