user3279189

Reputation: 1653

Hadoop's map join equivalent in spark sql

I'm looking for the equivalent of Hadoop's map join in Spark SQL, and I found `spark.sql.autoBroadcastJoinThreshold`.

  1. Does it work with Spark SQL? I tried it, but it did not seem to have any effect: shuffle read/write was the same whether or not I applied the parameter.

I set the value and then ran my query: `sqlContext.sql("SET spark.sql.autoBroadcastJoinThreshold=100000000;")`

  2. Is there any other equivalent concept in Spark SQL?

Thanks.

Upvotes: 1

Views: 1282

Answers (1)

Spiro Michaylov

Reputation: 3571

  1. `spark.sql.autoBroadcastJoinThreshold` was introduced in Spark 1.1.0.
  2. It is tested (a little bit) in the Spark test suite -- see PlannerSuite.
  3. Your SET query is cheerfully and silently swallowed by versions of Spark that don't support the setting -- I just tried it with 1.0.2.
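As a sketch of how you might verify whether the setting actually kicks in (assuming Spark 1.1+, an existing `sqlContext`, and two already-registered tables whose names here, `big_table` and `small_table`, are hypothetical), inspect the physical plan rather than relying on shuffle metrics:

```scala
// Set the broadcast threshold (bytes); tables estimated below this size
// become candidates for a map-side (broadcast) join.
sqlContext.sql("SET spark.sql.autoBroadcastJoinThreshold=100000000")

val joined = sqlContext.sql(
  "SELECT b.*, s.label FROM big_table b JOIN small_table s ON b.key = s.key")

// A broadcast join shows up as BroadcastHashJoin in the physical plan;
// a shuffle join appears as ShuffledHashJoin instead.
println(joined.queryExecution.executedPlan)
```

One caveat worth checking: the threshold is compared against Spark's *size estimate* of the table, which for Hive tables typically comes from metastore statistics. If no statistics are available, Spark may not recognize the small table as broadcastable, which could explain seeing identical shuffle read/write either way.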

Upvotes: 0

Related Questions