Reputation: 1
I'm trying to get PySpark to work with Spark Connect, but I'm not sure whether this is supported.
from pyspark.sql import SparkSession

spark = SparkSession.builder.remote("sc://localhost:15002").getOrCreate()
print("Connected to Spark Connect")
everyone = spark.sql("SELECT id, name FROM people")
everyone.show()
It seems that spark.sql, SQLContext, and spark.sparkContext are all unsupported when using Spark Connect. The spark.sql call fails with the following error:
Connected to Spark Connect
Traceback (most recent call last):
  File "H:\portable\Python311\Lib\site-packages\pyspark\sql\connect\client\core.py", line 1457, in _execute_and_fetch
    for response in self._execute_and_fetch_as_iterator(
  File "H:\portable\Python311\Lib\site-packages\pyspark\sql\connect\client\core.py", line 1434, in _execute_and_fetch_as_iterator
    self._handle_error(error)
  File "H:\portable\Python311\Lib\site-packages\pyspark\sql\connect\client\core.py", line 1704, in _handle_error
    self._handle_rpc_error(error)
  File "H:\portable\Python311\Lib\site-packages\pyspark\sql\connect\client\core.py", line 1779, in _handle_rpc_error
    raise convert_exception(
pyspark.errors.exceptions.connect.ParseException:
[PARSE_SYNTAX_ERROR] Syntax error at or near end of input.(line 1, pos 0)

== SQL ==

^^^

python-BaseException
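One detail worth noting: the parser rejected the statement at line 1, pos 0, and the == SQL == section of the error is empty, which is what Spark reports when the query string it received was blank. A minimal, Spark-free sketch of a client-side guard against that case (validated_sql is a hypothetical helper, not part of the PySpark API):

```python
def validated_sql(query: str) -> str:
    """Hypothetical guard: fail fast on a blank query instead of
    round-tripping an empty statement to the Spark Connect server,
    which would answer with PARSE_SYNTAX_ERROR at line 1, pos 0."""
    if not query or not query.strip():
        raise ValueError("spark.sql() was given an empty query string")
    return query

# usage (assumes a working Spark Connect session named `spark`):
# everyone = spark.sql(validated_sql("SELECT id, name FROM people"))
```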
I've tried spark.sql, SQLContext, and spark.sparkContext, all without success. My expectation is that PySpark works with Spark Connect and performs predicate push-down. Does anyone know how to resolve this? Thanks.
Upvotes: 0
Views: 57