Stark

Reputation: 634

How to add Spark configuration in Databricks cluster

I am using a Databricks Spark cluster and want to add a customized Spark configuration.
There is Databricks documentation on this, but I am not getting any clue about how and what changes I should make. Can someone please share an example of how to configure a Databricks cluster?
Is there any way to see the default Spark configuration in a Databricks cluster?

Upvotes: 8

Views: 23580

Answers (3)

Mahesh Malpani

Reputation: 2009

Ideally this should be set in the cluster's Advanced Options, where there is a Spark configuration section.

It can also be set from PySpark code.

There are also cluster policies, which you can create and attach to your cluster; a policy can install libraries and, I think, apply Spark configuration as well.
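
If it helps, here is a rough sketch of what creating such a policy through the Cluster Policies REST API could look like. The workspace URL, token, policy name, and the pinned property/value below are placeholders, not anything specific to this question:

    import requests

    # Placeholders -- substitute your own workspace URL and access token.
    DATABRICKS_HOST = "https://<your-workspace>.cloud.databricks.com"
    TOKEN = "<personal-access-token>"

    # A policy definition is a JSON string; "fixed" pins the value on every
    # cluster that is created with this policy attached.
    policy_definition = """
    {
      "spark_conf.spark.sql.shuffle.partitions": {
        "type": "fixed",
        "value": "200"
      }
    }
    """

    resp = requests.post(
        f"{DATABRICKS_HOST}/api/2.0/policies/clusters/create",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={"name": "pin-shuffle-partitions", "definition": policy_definition},
    )
    print(resp.json())  # returns the new policy_id on success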

Upvotes: 0

Leonardo Pedroso

Reputation: 140

There are several ways to set the cluster's default Spark configs:

  1. Manually in the "Compute" tab (as mentioned before): go to Compute > select a cluster > Advanced Options > Spark.

  2. Via notebook (as mentioned before): in a cell of your Databricks notebook, you can set any Spark configuration for that session/job by running the "spark.conf.set" command, e.g. spark.conf.set("spark.executor.memory", "4g").

  3. Using the Jobs CLI/API: if you are aiming to deploy jobs programmatically in a multi-environment fashion (e.g. Dev, Staging, Production), you can include the Spark configuration in the job's cluster specification (see the sketch at the end of this answer).

Useful links!
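
To flesh out option 3 a bit: a rough sketch of a job specification that carries custom Spark configuration on its job cluster, submitted through the Jobs API (2.1), might look like the code below. The host, token, notebook path, node type, Spark version, and config values are all illustrative placeholders:

    import requests

    # Placeholders -- substitute your own workspace URL and access token.
    DATABRICKS_HOST = "https://<your-workspace>.cloud.databricks.com"
    TOKEN = "<personal-access-token>"

    job_spec = {
        "name": "example-job-with-spark-conf",
        "tasks": [
            {
                "task_key": "main",
                "notebook_task": {"notebook_path": "/Repos/me/my-notebook"},
                "new_cluster": {
                    "spark_version": "13.3.x-scala2.12",
                    "node_type_id": "i3.xlarge",
                    "num_workers": 2,
                    # Custom Spark configuration applied to the job cluster
                    "spark_conf": {
                        "spark.sql.shuffle.partitions": "200",
                        "spark.executor.memory": "4g",
                    },
                },
            }
        ],
    }

    resp = requests.post(
        f"{DATABRICKS_HOST}/api/2.1/jobs/create",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json=job_spec,
    )
    print(resp.json())  # contains the new job_id on success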

Upvotes: 0

Joey Gomes

Reputation: 66

  1. You can set the cluster config in the Compute section of your Databricks workspace. Go to Compute (and select a cluster) > Configuration > Advanced options, where you will find the Spark config field.

  2. Or, you can set configs via a notebook.

    %python
    spark.conf.set("spark.sql.name-of-property", value)
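
As a small illustration of the notebook approach (assuming the spark session object that Databricks notebooks provide, and using an example property name and value), you can also read values back to see what is currently in effect, which partly addresses the question about inspecting the cluster's configuration:

    %python
    # Session-level property (example name/value)
    spark.conf.set("spark.sql.shuffle.partitions", "200")
    print(spark.conf.get("spark.sql.shuffle.partitions"))  # -> 200

    # List the properties currently set on the cluster's SparkContext,
    # including the ones Databricks applies by default.
    for key, value in spark.sparkContext.getConf().getAll():
        print(key, value)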

Upvotes: 2
