Reputation: 239
I receive an error when calling a udf from within withColumn in Spark using Scala. The error occurs at compile time, while building with SBT.
val hiveRDD = sqlContext.sql("select * from iac_trinity.ctg_us_clickstream")
hiveRDD.persist()
val trnEventDf = hiveRDD
  .withColumn("system_generated_id", getAuthId(hiveRDD("session_user_id")))
  .withColumn("application_assigned_event_id", hiveRDD("event_event_id"))
val getAuthId = udf((session_user_id:String) => {
  if (session_user_id != None){
    if (session_user_id != "NULL"){
      if (session_user_id != "null"){
        session_user_id
      }else "-1"
    }else "-1"
  }else "-1"
})
The error I receive is:
scala:58: No TypeTag available for String
val getAuthId = udf((session_user_id:String) => {
It compiles when I use (session_user_id:Any) instead of (session_user_id:String), but it then fails at runtime because Spark does not support Any as a column type. Please let me know how to handle this.
Upvotes: 0
Views: 527
Reputation: 67075
Have you tried being explicit with your types?
udf[String, String]((session_user_id:String)...
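Separately from the TypeTag error, note that a String never equals None, so the first check in the question's UDF never catches a null value. Below is a minimal sketch of the same fallback logic as a plain function, using Option to handle null. The names AuthIdExample and normalizeUserId are made up for illustration and are not from the question. The Spark wrapping is shown only in a comment and assumes Spark is on the classpath.

```scala
object AuthIdExample {
  // Hypothetical helper (not from the original post): maps a missing or
  // placeholder user id to "-1" and passes any other value through.
  def normalizeUserId(sessionUserId: String): String =
    Option(sessionUserId)                            // a null String becomes None
      .filterNot(id => id == "NULL" || id == "null") // drop the literal placeholders
      .getOrElse("-1")

  // With Spark on the classpath, the function could be wrapped with explicit
  // type parameters (return type first, then argument type), for example:
  //   import org.apache.spark.sql.functions.udf
  //   val getAuthId = udf[String, String](normalizeUserId _)
}
```

Keeping the logic in an ordinary function also makes it possible to check it without a Spark session, and the udf value should be defined before the withColumn call that uses it.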
Upvotes: 1