Spark how to use a UDF with a Join

Question

I'd like to use a specific UDF with using Spark

Here's the plan:

I have a table A(10 million rows) and a table B(15 millions rows)

I'd like to use an UDF comparing one element of the table A and one of the table B Is it possible

Here's a a sample of my code. At some point i also need to say that my UDF compare must be greater than 0,9:

DataFrame dfr = df
                .select("name", "firstname", "adress1", "city1","compare(adress1,adress2)")
                .join(dfa,df.col("adress1").equalTo(dfa.col("adress2"))
                        .and((df.col("city1").equalTo(dfa.col("city2"))
                                ...;

Is it possible ?

Spark how to use a UDF with a Join

Answers (1)

Related Questions