Reputation: 1281
I ran /bin/pyspark to get some practice, but the console throws the error shown below.
**[dst@localhost bin]$ ./pyspark
Python 2.6.6 (r266:84292, Aug 18 2016, 15:13:37)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-17)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
17/02/07 01:45:41 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/02/07 01:45:41 WARN spark.SparkConf:
SPARK_CLASSPATH was detected (set to '').
This is deprecated in Spark 1.0+.
Please instead use:
- ./spark-submit with --driver-class-path to augment the driver classpath
- spark.executor.extraClassPath to augment the executor classpath
17/02/07 01:45:41 WARN spark.SparkConf: Setting 'spark.executor.extraClassPath' to '' as a work-around.
17/02/07 01:45:41 WARN spark.SparkConf: Setting 'spark.driver.extraClassPath' to '' as a work-around.
17/02/07 01:45:41 WARN util.Utils: Your hostname, localhost.localdomain resolves to a loopback address: 127.0.0.1; using 10.0.2.15 instead (on interface eth1)
17/02/07 01:45:41 WARN util.Utils: Set SPARK_LOCAL_IP if you need to bind to another address
/usr/local/spark/latest/python/pyspark/context.py:194: UserWarning: Support for Python 2.6 is deprecated as of Spark 2.0.0
warnings.warn("Support for Python 2.6 is deprecated as of Spark 2.0.0")
Traceback (most recent call last):
File "/usr/local/spark/latest/python/pyspark/shell.py", line 43, in <module>
spark = SparkSession.builder\
File "/usr/local/spark/latest/python/pyspark/sql/session.py", line 179, in getOrCreate
session._jsparkSession.sessionState().conf().setConfString(key, value)
File "/usr/local/spark/latest/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
File "/usr/local/spark/latest/python/pyspark/sql/utils.py", line 79, in deco
raise IllegalArgumentException(s.split(': ', 1)[1], stackTrace)
pyspark.sql.utils.IllegalArgumentException: u"Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState':"
**
As a result, I cannot get a SparkContext (the sc variable) to run RDD operations. I searched on Google but could not find a suitable solution. Could you help me get pyspark working normally?
(My Spark version is 2.1.0.)
Upvotes: 0
Views: 2410
Reputation: 26
You need to launch your SparkSession with .enableHiveSupport(). This error means that Spark was not able to start a Hive session.
from pyspark.sql import SparkSession  # already imported inside the pyspark shell
spark = SparkSession.builder.appName("Application name").enableHiveSupport().getOrCreate()
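To connect this back to the question's goal of using sc for RDD operations, here is a minimal sketch, assuming a working pyspark installation. The helper names create_session and squares, and the hive flag, are hypothetical and only for illustration; the Spark calls themselves (SparkSession.builder, appName, enableHiveSupport, getOrCreate, sparkContext, parallelize) are the standard PySpark 2.x API.

```python
def create_session(app_name="Application name", hive=True):
    """Return a SparkSession, optionally with Hive support enabled.

    Hypothetical helper: the name and the `hive` flag are illustrative only.
    """
    # Imported inside the function so the helper can be defined
    # even on a machine where pyspark is not on the Python path.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    if hive:
        # Needs a Spark build that includes the Hive classes.
        builder = builder.enableHiveSupport()
    return builder.getOrCreate()


def squares(spark, n=5):
    """Run a tiny RDD job through the session's SparkContext."""
    sc = spark.sparkContext  # the same kind of object the shell exposes as `sc`
    return sc.parallelize(range(n)).map(lambda x: x * x).collect()
```

Calling spark = create_session() followed by squares(spark) should exercise the session and its SparkContext end to end. If the Hive session still cannot be instantiated, passing hive=False may at least give a plain session whose sparkContext can be used for RDD work; that fallback is an assumption, not something confirmed in this thread.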
Upvotes: 1