Reputation: 63022
When running pyspark 1.6.x, it comes up just fine:
17/02/25 17:35:41 INFO storage.BlockManagerMaster: Registered BlockManager
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/__ / .__/\_,_/_/ /_/\_\ version 1.6.1
/_/
Using Python version 2.7.13 (default, Dec 17 2016 23:03:43)
SparkContext available as sc, SQLContext available as sqlContext.
>>>
But after I reset SPARK_HOME, PYTHONPATH, and PATH to point to a Spark 2.x installation, things go south quickly.
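For reference, the reset amounts to something like the following (the install path and py4j version are illustrative, not my exact values):
export SPARK_HOME=/opt/spark-2.1.0          # illustrative install path
export PATH="$SPARK_HOME/bin:$PATH"
export PYTHONPATH="$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.10.4-src.zip:$PYTHONPATH"   # py4j version varies by Spark release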
(a) I have to manually delete a Derby metastore_db directory each time (the cleanup is shown after the log below).
(b) pyspark does not launch: it hangs after printing these unhappy warnings:
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
NOTE: SPARK_PREPEND_CLASSES is set, placing locally compiled Spark classes ahead of assembly.
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
17/02/25 17:32:49 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/02/25 17:32:53 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
17/02/25 17:32:53 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
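The manual cleanup for (a) is just removing the Derby artifacts from the directory I launch from, roughly:
rm -rf metastore_db derby.log    # Derby metastore artifacts left in the working directory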
I do not need or care about Hive capabilities, but it may well be that they are required in Spark 2.x. What is the simplest working Hive configuration to make pyspark 2.x happy?
Upvotes: 1
Views: 1282
Reputation: 4605
Have you tried the enableHiveSupport function? I had issues with DataFrames when migrating from 1.6 to 2.x, even when I wasn't accessing Hive. Calling that function on the builder solved my problem. (You can also add it to the config.)
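In code, that looks roughly like this (a minimal sketch; the app name is arbitrary):
from pyspark.sql import SparkSession

# Enable Hive support when building the session; getOrCreate() reuses
# an already-running session if one exists.
spark = (SparkSession.builder
         .appName("example")       # arbitrary app name
         .enableHiveSupport()      # config equivalent: .config("spark.sql.catalogImplementation", "hive")
         .getOrCreate())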
If you're using the pyspark shell to provision your Spark context, you'll need to enable Hive support via the config instead: in your spark-defaults.conf, try adding spark.sql.catalogImplementation hive.
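That is, something like:
# $SPARK_HOME/conf/spark-defaults.conf
spark.sql.catalogImplementation  hive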
Upvotes: 2