Reputation: 124
Is there any way I can use UDFs created in PySpark in a Java Spark job?
I know there is a way to use a Java UDF in PySpark, but I am looking for the other way round.
Upvotes: 0
Views: 283
Reputation: 707
First, I have to say that I don't recommend doing that. It is likely to add significant latency to the UDF, and I really suggest you try to write the UDF in Scala/Java instead.
If you still want to do it, here is how: write a UDF that creates a Python interpreter (for example Jython's PythonInterpreter) and executes your code. Here is a Scala code example:
import org.python.util.PythonInterpreter
import org.python.core.PyString

// Skip importing Python's site module to speed up interpreter startup
System.setProperty("python.import.site", "false")
val interpreter = new PythonInterpreter
interpreter.exec("from __builtin__ import *")
// Look up a Python function that takes a string and returns its length
val someFunc = interpreter.get("len")
val result = someFunc.__call__(new PyString("Test!"))
val realResult = result.__tojava__(classOf[Integer]).asInstanceOf[Int]
println(realResult)
This code calls the Python len function on the string "Test!" and returns its result.
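If you actually need this inside a Spark job, a minimal sketch might look like the following. It assumes Jython is on the classpath and wraps the interpreter in a Scala UDF; the object name PyLen, the app name, and the column names are hypothetical, not part of any existing API.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}
import org.python.util.PythonInterpreter
import org.python.core.PyString

// Hypothetical wrapper: one lazily created interpreter per JVM,
// so it is not rebuilt for every row the UDF processes.
object PyLen {
  lazy val interpreter: PythonInterpreter = {
    System.setProperty("python.import.site", "false")
    val interp = new PythonInterpreter
    interp.exec("from __builtin__ import *")
    interp
  }

  def length(s: String): Int = {
    val pyLen = interpreter.get("len")
    pyLen.__call__(new PyString(s)).__tojava__(classOf[Integer]).asInstanceOf[Int]
  }
}

val spark = SparkSession.builder().appName("jython-udf-sketch").getOrCreate()
val pyLenUdf = udf((s: String) => PyLen.length(s))

// Example usage on a hypothetical DataFrame with a string column "name"
val df = spark.createDataFrame(Seq(Tuple1("Test!"))).toDF("name")
df.withColumn("name_length", pyLenUdf(col("name"))).show()

Note that each executor JVM will spin up its own interpreter, which is part of the overhead mentioned above.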
I really think this will hurt the performance of your job, and you should reconsider the plan.
Upvotes: 1