FEST

Reputation: 883

Azure Databricks : Geospatial queries with Spark SQL

Currently I have the following:
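Schematically (the original query isn't shown here; table and column names are placeholders):

SELECT *
FROM points
WHERE lat BETWEEN minLat AND maxLat
  AND lon BETWEEN minLon AND maxLon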

I want to see if I can improve the "lat between minLat and maxLat and lon between minLon and maxLon" part with a spatial library. One example I checked was GeoSpark. The issue here is that the current versions of GeoSpark (and GeoSparkSQL) only work with Spark v2.3, and no supported Databricks runtime works with that version anymore.

Any ideas what I can do?

Note: I cannot deviate from SQL at the moment.

Upvotes: 1

Views: 555

Answers (1)

Alex Ott

Reputation: 87214

GeoSpark joined the Apache Software Foundation as the Apache Sedona project, and a version supporting Spark 3.0 was released around two weeks ago, so you can use it the same way as GeoSpark.
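For example, with the Sedona jars attached to the cluster and its SQL functions registered (e.g. via SedonaSQLRegistrator.registerAll(spark), or automatically as in the P.S. below), the bounding-box filter can be expressed with Sedona's spatial predicates. A sketch, with a placeholder table name:

SELECT *
FROM points
WHERE ST_Contains(
        -- envelope arguments are (minX, minY, maxX, maxY), i.e. lon before lat
        ST_PolygonFromEnvelope(minLon, minLat, maxLon, maxLat),
        ST_Point(lon, lat))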

P.S. To automate the registration of functions, we can create something like the following, compile it into a jar, and then configure Spark with --conf spark.sql.extensions=...SomeExtensions:

import org.apache.spark.sql.SparkSessionExtensions

class SomeExtensions extends (SparkSessionExtensions => Unit) {
  def apply(e: SparkSessionExtensions): Unit = {
    e.injectCheckRule(spark => {
      // Setup something, e.g. SedonaSQLRegistrator.registerAll(spark)
      _ => () // the check rule itself does nothing
    })
  }
}
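Note that a check rule fires on every query, so whatever setup it performs should be idempotent (re-registering SQL functions effectively is). On Databricks, the same setting can also go into the cluster's Spark config instead of --conf, e.g. (the package name here is hypothetical):

spark.sql.extensions com.example.SomeExtensions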

Upvotes: 1
