Reputation: 407
I have two datasets and I would like to join them when the value in a column of one contains the value in a column of the other. How can I do this? I want to change
val df = df1.join(df2,
df1.col("Complete Name").equalTo(df2.col("Name")))
into something like
val df = df1.join(df2,
df1.col("Complete Name").ifContain(df2.col("Name")))
Upvotes: 0
Views: 179
Reputation: 668
What if you do something like this?
{
df1.join(df2, df1.col("Complete Name").contains(df2.col("Name")), "left_anti")
.union(df2.join(df1, df1.col("Complete Name").contains(df2.col("Name")), "left_anti"))
}
Didn't test it though.
Upvotes: 0
Reputation: 25980
How about:
Dataset<Row> d1 = datasetFromJsonStrings(listOf("{\n" +
" \"key\": \"name\",\n" +
" \"origin\": \"left\"\n" +
"}"));
Dataset<Row> d2 = datasetFromJsonStrings(listOf("{\n" +
" \"key\": \"complete name\",\n" +
" \"origin\": \"right\"\n" +
"}"));
// [name,left,complete name,right]
List<Row> rows = d1.join(d2, d2.col("key").contains(d1.col("key"))).collectAsList();
Note: I did it in Java out of convenience because my entire codebase is in Java, not Scala.
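For reference, here is a rough Scala sketch of the same idea, since the question was asked in Scala. It relies on `Column.contains`, which accepts another column as its argument. The column names come from the question; the sample rows, the object name and the local `SparkSession` setup are assumptions made up for illustration, so adapt them to your own data.

```scala
import org.apache.spark.sql.SparkSession

object ContainsJoinSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only; a real job would reuse its existing session.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("contains-join-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data; only the column names are taken from the question.
    val df1 = Seq("John Smith", "Jane Doe").toDF("Complete Name")
    val df2 = Seq("John", "Alice").toDF("Name")

    // Inner join that keeps a row pair when "Complete Name" contains "Name" as a substring.
    val joined = df1.join(df2, df1.col("Complete Name").contains(df2.col("Name")))

    joined.show()
    spark.stop()
  }
}
```

With this sample data, only the pair "John Smith" / "John" should survive the join. Keep in mind that this is not an equality condition, so Spark cannot use a hash or sort-merge join for it and has to compare row pairs, which can get slow on large inputs.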
Upvotes: 2