danbar2001

Reputation: 23

How to pass variable arguments to a Spark DataFrame reader using PySpark?

I am using the Crealytics Spark Excel library to read an Excel workbook into a Spark DataFrame from a Databricks Python notebook.

Hardcoding the options like this works fine:

df = (spark.read.format("com.crealytics.spark.excel")
      .option("useHeader", "true")
      .option("dataAddress", "'Sheet1'!")
      .load("/FileStore/tables/Test.xlsx"))

I would like to read a dynamic list of options from a table into a Python structure (such as a list or dict) and pass these to the DataFrame reader as varargs.

However, it fails even when trying to pass in just one option:

test = {"useHeader": "true"}

df = (spark.read.format("com.crealytics.spark.excel")
      .option(*test)
      .option("dataAddress", "'Sheet1'!")
      .load("/FileStore/tables/Test.xlsx"))

TypeError: option() takes exactly 3 arguments (2 given)

Upvotes: 2

Views: 1644

Answers (1)

user11042628


Use options, not option:

options(**options)

Adds input options for the underlying data source.

As the signature shows, options takes keyword arguments, so double-star dictionary unpacking is a valid way to provide them. Your call fails because a single star (*test) unpacks only the dictionary's keys as positional arguments, so option receives one argument where it expects a key and a value.
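For example, a minimal sketch of the working read, reusing the path and options from your question:

test = {"useHeader": "true", "dataAddress": "'Sheet1'!"}

df = (spark.read.format("com.crealytics.spark.excel")
      .options(**test)  # ** unpacks the dict into keyword arguments
      .load("/FileStore/tables/Test.xlsx"))

And if the options live in a two-column table, collecting them into a dict could look like this sketch (the table name excel_options and the column names key and value are assumptions):

# Collect the small options table to the driver and build a dict from its rows
rows = spark.table("excel_options").collect()
test = {row["key"]: row["value"] for row in rows}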

Upvotes: 2
