Reputation: 2312
I am interested in performing big-data geospatial analysis on Apache Spark. My data is stored in Azure Data Lake, and I am restricted to using Azure Databricks. Is there any way to install GeoMesa on Databricks? I would also like to use the Python API; what should I do?
Any help is much appreciated!!
Upvotes: 2
Views: 2133
Reputation: 36
CCRi (the backers of GeoMesa) publishes a Spark-runtime-friendly build. A shaded fat jar for GeoMesa (currently version 3.3.0) is available at the Maven coordinates org.locationtech.geomesa:geomesa-gt-spark-runtime_2.12:3.3.0, and it works on Databricks. Since the jar is shaded, add the Maven exclusions jline:*,org.geotools:* (without quotes) in the Databricks library UI so it installs cleanly.
Upvotes: 1
Reputation: 1258
You can install the GeoMesa library directly on your Databricks cluster:
1) Select the Libraries option; a new window will open.
2) Select the Maven option and click the 'Search Packages' option.
3) Search for the required library, choose the library/jar version, and click 'Select'.
That's it. After the library/jar is installed, restart your cluster.
Now import the required classes in your Databricks notebook.
I hope it helps. Happy coding!
Upvotes: 4
Reputation: 12788
Running GeoMesa within Databricks is not straightforward:
Reference: Use GeoMesa in Databricks
Hope this helps.
Upvotes: 0
Reputation: 1634
As a starting point, without knowing any more details, you should be able to use the GeoMesa FileSystem data store against files stored in WASB (Windows Azure Storage Blob).
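To make that concrete, here is a hypothetical parameter map for the GeoMesa FileSystem DataStore pointed at a WASB path. The keys "fs.path" and "fs.encoding" follow the FileSystem DataStore's documented parameter names, but the container/account values are placeholders, and how you hand the map to Spark (e.g. via geomesa_pyspark) depends on your setup:

```python
# Hypothetical GeoMesa FileSystem DataStore parameters for data stored in
# Azure Blob Storage (WASB). The wasbs:// path is a placeholder, not a
# real account; substitute your own container and storage account.
fsds_params = {
    # Root directory the datastore reads and writes under
    "fs.path": "wasbs://<container>@<account>.blob.core.windows.net/geomesa/",
    # On-disk encoding of the features (e.g. parquet or orc)
    "fs.encoding": "parquet",
}
```

In a notebook these options would typically be passed when creating the DataFrame, e.g. something along the lines of spark.read.format("geomesa").options(**fsds_params), assuming the shaded runtime jar is already installed on the cluster.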
Upvotes: 1