GMan

Reputation: 464

Java Spark withColumn - custom function

Problem: please give any solutions in Java (not Scala or Python).

I have a DataFrame with the following data

colA, colB
23,44
24,64

What I want is a DataFrame like this:

colA, colB, colC
23,44, result of myFunction(23,44)
24,64, result of myFunction(24,64)

Basically, I would like to add a column to the DataFrame in Java, where the value of the new column is computed by passing the values of colA and colB through a complex function that returns a String.

Here is what I've tried, but the argument passed to complexFunction only seems to be the column name 'colA', rather than the values in colA.

myDataFrame.withColumn("ststs", (complexFunction(myDataFrame.col("colA")))).show();

Upvotes: 1

Views: 3471

Answers (1)

mahmoud mehdi

Reputation: 1590

As suggested in the comments, you should use a User Defined Function (UDF). Let's suppose that you have a myFunction method which does the complex processing:

val myFunction : (Int, Int) => String = (colA, colB) => {...}

Then all you need to do is turn your function into a UDF and apply it to columns A and B:

import org.apache.spark.sql.functions.{udf, col}

val myFunctionUdf = udf(myFunction)
myDataFrame.withColumn("colC", myFunctionUdf(col("colA"), col("colB")))
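Since the question asks for Java specifically, here is a sketch of the same idea using Spark's Java UDF API. It registers a UDF2 (two input columns, one result) and invokes it with callUDF; the body of the lambda is a placeholder for your complex processing, and myDataFrame is assumed to be the DataFrame from the question:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.api.java.UDF2;
import org.apache.spark.sql.types.DataTypes;

import static org.apache.spark.sql.functions.callUDF;
import static org.apache.spark.sql.functions.col;

public class UdfExample {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("udf-example")
                .master("local[*]")
                .getOrCreate();

        // myDataFrame is assumed to have integer columns colA and colB,
        // as in the question.
        Dataset<Row> myDataFrame = spark.sql(
                "SELECT * FROM VALUES (23, 44), (24, 64) AS t(colA, colB)");

        // Register the complex function as a UDF taking two Integers
        // and returning a String. The concatenation here is just a
        // placeholder for the real processing.
        spark.udf().register("myFunction",
                (UDF2<Integer, Integer, String>) (colA, colB) -> colA + ":" + colB,
                DataTypes.StringType);

        // Apply the UDF to the column *values*, not the column names.
        myDataFrame
                .withColumn("colC", callUDF("myFunction", col("colA"), col("colB")))
                .show();

        spark.stop();
    }
}
```

The key point is the same as in the Scala version: complexFunction cannot be called directly on a Column, because a Column is a symbolic expression, not a value. Wrapping the function as a UDF tells Spark to evaluate it per row against the actual values.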

I hope this helps.

Upvotes: 0
