Reputation: 355
I want to use Catalyst rules to transform a star-schema (https://en.wikipedia.org/wiki/Star_schema) SQL query into a query against a denormalized version of the schema, where some fields from the dimension tables are duplicated in the fact table. I looked for extension points that would let me add my own rules to perform this transformation, but I couldn't find any. So I have the following questions:
Upvotes: 5
Views: 2498
Reputation: 11285
Following @Ambling's advice, you can use sparkSession.experimental.extraStrategies to add your functionality to the SparkPlanner.
An example strategy that simply prints "Hello world!" to the console:
import org.apache.spark.sql.Strategy
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.execution.SparkPlan

object MyStrategy extends Strategy {
  def apply(plan: LogicalPlan): Seq[SparkPlan] = {
    println("Hello world!")
    Nil // returning no physical plan lets planning fall through to the built-in strategies
  }
}
with example run:
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local").getOrCreate()
spark.experimental.extraStrategies = Seq(MyStrategy)
val q = spark.catalog.listTables.filter(t => t.name == "five")
q.explain(true)
spark.stop()
You can find a working example project in a friend's GitHub repository: https://github.com/bartekkalinka/spark-custom-rule-executor
Upvotes: 4
Reputation: 446
Note that as of Spark 2.0 you can plug in extraStrategies and extraOptimizations through SparkSession.experimental.
Upvotes: 2