huy

Reputation: 1884

How to run PySpark with installed packages?

Normally, when I run pyspark with graphframes I have to use this command:

pyspark --packages graphframes:graphframes:0.8.1-spark3.0-s_2.12

The first time I run this, it installs the graphframes package, but not on subsequent runs. In my .bashrc file, I have already added:

export SPARK_OPTS="--packages graphframes:graphframes:0.8.1-spark3.0-s_2.12"

But I still cannot import the package unless I add the --packages option.

How can I run pyspark with graphframes with this simple command?

pyspark

Upvotes: 0

Views: 130

Answers (1)

pltc

Reputation: 6082

The simplest solution is to make a wrapper script, e.g. myspark.sh, that runs pyspark --packages graphframes:graphframes:0.8.1-spark3.0-s_2.12 for you.
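A minimal sketch of such a wrapper (the name myspark.sh is just an example; "$@" forwards any extra arguments you pass to the wrapper on to pyspark):

```shell
# Create a wrapper script that always passes the --packages option to pyspark
cat > myspark.sh <<'EOF'
#!/usr/bin/env bash
# Launch pyspark with graphframes pre-loaded; forward any extra arguments
exec pyspark --packages graphframes:graphframes:0.8.1-spark3.0-s_2.12 "$@"
EOF
chmod +x myspark.sh
```

Put the script somewhere on your PATH (or call it as ./myspark.sh) and then running myspark.sh behaves like plain pyspark with the package already attached.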

Upvotes: 1
