Kaja

Reputation: 3057

Running an exported Databricks notebook in local Spark on a VM

I have installed Spark locally, with all its dependencies, on a VM.


Now I want to run an exported Databricks notebook in this environment. Can I achieve this without using Jupyter?

Upvotes: 0

Views: 223

Answers (1)

PieCot

Reputation: 3639

You can export a Databricks notebook as a "plain" Python file (File > Export menu) containing the code from all the cells.

Databricks uses comments in this file to reconstruct the notebook structure if you re-import it, but, being comments, they are ignored by the Python interpreter when you run the file as a script locally.
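For reference, the exported .py file typically looks something like the sketch below: a "# Databricks notebook source" header, "# COMMAND ----------" markers between cells, and "# MAGIC" lines for markdown cells (the actual DataFrame code and path here are made up for illustration):

# Databricks notebook source
# MAGIC %md
# MAGIC # Example notebook title

# COMMAND ----------

# hypothetical cell: read some data with the notebook's spark session
df = spark.read.parquet("/path/to/data")

# COMMAND ----------

df.show()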

The only thing missing is a Spark session (the spark variable in the notebook code). You can add the following lines at the beginning of the file to get a Spark instance:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
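Since getOrCreate() returns an existing session if one is already active, these two lines are harmless if you later re-import the file into Databricks, where a session is already provided.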

and then use spark-submit in your local environment:

spark-submit your_notebook.py
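If you want to be explicit about running on the local machine, you can also pass a master URL (local[*] uses all available cores):

spark-submit --master local[*] your_notebook.py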

Mind that this approach works only for fairly simple notebooks, i.e. ones that do not rely on Databricks-specific functionality or utilities (dbutils, magic commands, mounted storage, and so on).
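If your notebook only makes light use of dbutils, one workaround (a minimal sketch, assuming the notebook only calls dbutils.widgets.get; the stub class and its env-var fallback are hypothetical) is to provide a local stand-in when dbutils is not defined:

# dbutils is injected by the Databricks runtime, so it does not exist locally
try:
    dbutils  # noqa: F821 - defined only on Databricks
except NameError:
    import os

    class _DbutilsStub:
        class widgets:
            @staticmethod
            def get(name):
                # hypothetical fallback: read widget values from environment variables
                return os.environ.get(name, "")

    dbutils = _DbutilsStub()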

Upvotes: 1
