user3279189

Reputation: 1653

Hadoop's map join equivalent in spark sql

I'm looking for the equivalent of Hadoop's map join in Spark SQL, and I found `spark.sql.autoBroadcastJoinThreshold`.

  1. Does it work with Spark SQL? I tried it, but it did not seem to have any effect: shuffle read/write was the same whether or not I applied the parameter.

I set the value and then ran my query: `sqlContext.sql("SET spark.sql.autoBroadcastJoinThreshold=100000000;")`

  2. Is there any other equivalent concept in Spark SQL?

Thanks.

Upvotes: 1

Views: 1282

Answers (1)

Spiro Michaylov

Reputation: 3571

  1. `spark.sql.autoBroadcastJoinThreshold` was introduced in Spark 1.1.0.
  2. It is tested (a little bit) in the Spark test suite -- see PlannerSuite.
  3. Your SET query is cheerfully and silently swallowed by versions of Spark that don't support the setting -- I just tried it with 1.0.2.
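As a sketch of how you might verify whether the setting actually kicks in (assuming Spark 1.1+, an existing `sqlContext`, and two already-registered tables whose names here, `big_table` and `small_table`, are hypothetical), inspect the physical plan rather than relying on shuffle metrics:

```scala
// Set the broadcast threshold (bytes); tables estimated below this size
// become candidates for a map-side (broadcast) join.
sqlContext.sql("SET spark.sql.autoBroadcastJoinThreshold=100000000")

val joined = sqlContext.sql(
  "SELECT b.*, s.label FROM big_table b JOIN small_table s ON b.key = s.key")

// A broadcast join shows up as BroadcastHashJoin in the physical plan;
// a shuffle join appears as ShuffledHashJoin instead.
println(joined.queryExecution.executedPlan)
```

One caveat worth checking: the threshold is compared against Spark's *size estimate* of the table, which for Hive tables typically comes from metastore statistics. If no statistics are available, Spark may not recognize the small table as broadcastable, which could explain seeing identical shuffle read/write either way.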

Upvotes: 0

Related Questions