Reputation: 8894
I'm trying to test a utility function that takes in a Spark DataFrame and outputs a new Spark DataFrame after some transformations. To mock data for the test, I need to construct a PySpark DataFrame for the input. Most examples I see of this use
spark.createDataFrame(data, columns)
I'm not too familiar with the docs and cannot find where "spark" comes from. How do you import it? Something like from pyspark.* import spark?
Upvotes: 0
Views: 2558
Reputation: 13551
I think you are looking for a way to get the spark session variable, right?
from pyspark.sql import SparkSession

# Get an existing SparkSession or create a new one;
# "local" runs Spark on a single machine, which is convenient for tests.
spark = SparkSession.builder \
    .master("local") \
    .getOrCreate()
You can modify the session builder with several options, for example .appName() to name the application or .config() to set Spark configuration values.
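Once you have the session, you can build the input DataFrame for your test with createDataFrame. A minimal sketch, where the rows and column names are made-up placeholders for your real test data:

data = [("alice", 1), ("bob", 2)]   # placeholder sample rows
columns = ["name", "id"]            # placeholder column names

# createDataFrame infers the schema from the Python values
df = spark.createDataFrame(data, columns)
df.show()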
Upvotes: 1