Clay

Reputation: 2734

Change Python version Livy uses in an EMR cluster

I am aware of the questions "Change Apache Livy's Python Version" and "How do I setup Pyspark in Python 3 with spark-env.sh.template".

I have also seen the Livy documentation.

However, none of that works. Livy keeps using Python 2.7 no matter what.

This is running Livy 0.6.0 on an EMR cluster.

Note: This works without any issues in another EMR cluster running Livy 0.7.0. I have gone over all of the settings on the other cluster and cannot find what is different. I did not have to do any of this on the other cluster; Livy just used python3 by default.

How exactly do I get Livy to use python3 instead of python2?

Upvotes: 2

Views: 1548

Answers (1)

Clay

Reputation: 2734

Finally just found an answer after posting.

I ran the following in a cell of a Jupyter session using the PySpark kernel, before running any other code, so that it took effect when the PySpark session was started on the remote EMR cluster via Livy:

%%configure -f
{
  "conf": {
    "spark.pyspark.python": "python3"
  }
}
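
For anyone not using sparkmagic: as far as I understand, the body of %%configure is passed along as the request body when the Livy session is created (POST /sessions), so the same setting can be sent to Livy directly. Here is a minimal sketch of that request, not something I ran on the cluster. The host name in the usage comment and the function name build_session_request are made up for illustration; 8998 is Livy's default port.

```python
import json
import urllib.request


def build_session_request(livy_url, python_bin="python3"):
    """Build (but do not send) a Livy POST /sessions request.

    The "conf" entry carries the same Spark property that the
    %%configure cell sets. `build_session_request` is a hypothetical
    helper name, not part of Livy or sparkmagic.
    """
    payload = {
        "kind": "pyspark",
        "conf": {"spark.pyspark.python": python_bin},
    }
    return urllib.request.Request(
        url=livy_url.rstrip("/") + "/sessions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Usage (hypothetical host; uncomment to actually create a session):
# request = build_session_request("http://my-emr-master:8998")
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it makes it easy to print the payload and confirm what Livy will receive.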

Simply adding "spark.pyspark.python": "python3" to the sparkmagic configuration (.sparkmagic/config.json) or to config_other_settings.json also worked.
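
If you would rather script the config file change, something like the sketch below should do it. It assumes the property belongs under "session_configs" -> "conf" in the sparkmagic config, which mirrors the %%configure body; check that against your own file. The function name set_pyspark_python and the path in the usage comment are hypothetical.

```python
import json
from pathlib import Path


def set_pyspark_python(config_path, interpreter="python3"):
    """Set spark.pyspark.python in a sparkmagic-style JSON config file.

    Assumption: the property lives under session_configs -> conf.
    Existing keys in the file are left untouched.
    """
    path = Path(config_path).expanduser()
    # Start from the existing config if there is one, else from an empty dict.
    config = json.loads(path.read_text()) if path.exists() else {}
    session_configs = config.setdefault("session_configs", {})
    conf = session_configs.setdefault("conf", {})
    conf["spark.pyspark.python"] = interpreter
    path.write_text(json.dumps(config, indent=2))
    return config


# Usage (hypothetical path; back up the file first):
# set_pyspark_python("~/.sparkmagic/config.json")
```

Restart the Jupyter kernel afterwards so the next Livy session is created with the new setting.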

It is confusing that this does not match the official Livy documentation.

Upvotes: 1
