Ranjith Sekar

Reputation: 1932

How to read Hive conf variables in a UDF's initialize method

I am trying to read a Hive conf variable in the initialize method, but it does not work. Any suggestions?

My UDF Class:

public class MyUDF extends GenericUDTF {
    MapredContext _mapredContext;

    @Override
    public void configure(MapredContext mapredContext) {
      _mapredContext = mapredContext;
      super.configure(mapredContext);
    }

    @Override
    public StructObjectInspector initialize(ObjectInspector[] args) throws UDFArgumentException {
      Configuration conf = _mapredContext.getJobConf();
      // I am getting conf as null here
    }
}

Upvotes: -1

Views: 965

Answers (2)

Gyanendra Dwivedi

Reputation: 5557

It is probably too late to answer this question, but for others, here is how to read the variable inside a GenericUDF evaluate() method:

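@Override
public Object evaluate(DeferredObject[] args) throws HiveException {
    String myconf = null;
    // SessionState.get() may return null when no session is attached to the current thread
    SessionState ss = SessionState.get();
    if (ss != null) {
        HiveConf conf = ss.getConf();
        myconf = conf.get("my.hive.conf");
        System.out.println("sysout.myconf:" + myconf);
    }
    // The returned value must match the ObjectInspector declared in initialize()
    return myconf;
}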

This code was tested on Hive 1.2.

You should also override the configure method to support MapReduce:

@Override
public void configure(MapredContext context) {
    // ... (other setup elided)
    JobConf conf = context.getJobConf();
    if (conf != null) {
        String myhiveConf = conf.get("temp_var");
    }
}

To test the code:

  1. Build the UDF JAR.
  2. In the Hive CLI, execute the following commands:

    SET hive.root.logger=INFO,console;
    SET my.hive.conf=test;
    ADD JAR /path/to/the/udf/jar;
    CREATE TEMPORARY FUNCTION test_udf AS 'com.example.my.udf.class.qualified.classname';
    

Upvotes: 1

ryanbwork

Reputation: 2153

I also ran into this issue with a custom UDTF. It seems that configure() is only called on the user-defined function once MapredContext.get() returns a non-null result (see UDTFOperator line 82, for example). MapredContext.get() likely returns null because the Hive job has not yet spun up its mappers/reducers: it returns null until MapredContext.init() has been called, and since init() takes a boolean isMap parameter, it is not called until MR/Tez runtime. The comment on the GenericUDTF.configure() method confirms this.

TL;DR: the UDF/UDTF initialize() method is called during job setup and configure() is called at MR runtime, hence the null result in your example code.
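To make the ordering concrete, here is a small Hive-free model of that lifecycle. It is only a sketch: FakeTaskContext, LazyConfFunction and LifecycleDemo are hypothetical names made up for illustration (they stand in for MapredContext and a UDTF, and are not Hive classes), and the key my.hive.conf is borrowed from the other answer. The idea it shows is to avoid touching the context in initialize() and to read the value lazily once configure() has supplied it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the runtime context that only exists inside a task.
final class FakeTaskContext {
    private final Map<String, String> jobConf;

    FakeTaskContext(Map<String, String> jobConf) {
        this.jobConf = jobConf;
    }

    String get(String key) {
        return jobConf.get(key);
    }
}

// Hypothetical function that mirrors the initialize()/configure()/process() order.
class LazyConfFunction {
    private FakeTaskContext context; // stays null until configure() runs
    private String cachedValue;      // filled in on first use after configure()

    // Job setup: reports whether a context is available yet, without dereferencing it.
    boolean initialize() {
        return context != null;
    }

    // Task runtime: the framework hands over the context here.
    void configure(FakeTaskContext ctx) {
        this.context = ctx;
    }

    // Per-row work: look the value up lazily, once a context is available.
    String process(String row) {
        if (cachedValue == null && context != null) {
            cachedValue = context.get("my.hive.conf");
        }
        return row + ":" + cachedValue;
    }
}

class LifecycleDemo {
    public static void main(String[] args) {
        LazyConfFunction fn = new LazyConfFunction();
        System.out.println("context available in initialize(): " + fn.initialize());

        Map<String, String> conf = new HashMap<>();
        conf.put("my.hive.conf", "test");
        fn.configure(new FakeTaskContext(conf));

        System.out.println(fn.process("row1"));
    }
}
```

In a real UDTF the same shape would mean keeping initialize() free of any MapredContext access and reading the JobConf in configure() or on the first process() call. Whether that fits depends on whether the value is needed to build the output ObjectInspector, which this sketch does not cover.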

Upvotes: -1
