user3276159

Reputation: 209

Running PySpark statements in AWS SageMaker Studio Lab

Folks: Apologies if this is a very basic question. I am trying out SageMaker Studio Lab (https://studiolab.sagemaker.aws).

After creating a new notebook, I noticed there is a choice of two kernels.

However, neither of them seems to support PySpark.

When I attempt to set up a PySpark session, I get an error saying that JAVA_HOME is not set.

import pyspark
from pyspark.sql import SparkSession
spark = SparkSession.builder \
    .master("local") \
    .appName("My ML Application") \
    .config("fs.s3a.endpoint", "s3.amazonaws.com") \
    .getOrCreate()


JAVA_HOME is not set

Is there any way to run PySpark code in SageMaker Studio Lab?

Upvotes: 0

Views: 676

Answers (1)

Shubham Agrawal

Reputation: 13

I was facing the same problem and it was resolved by this answer: https://repost.aws/questions/QUIruPbWNHQ2iqZsDZEj41hA/java-not-found-when-running-sagemaker-studio-python-notebooks#ANENCXCwUIQ_6S1QBhrrrw1w

Since I wanted to run some scripts from the terminal in SageMaker, I copied the two yum commands from that answer and ran them in the SageMaker terminal, which set JAVA_HOME for me.
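
To confirm whether the fix took effect before starting a Spark session, something like the following might help. It is only a minimal sketch: `find_java` is a hypothetical helper (not part of PySpark or SageMaker), and it assumes that Spark needs either a valid JAVA_HOME or a `java` binary on PATH, which is my understanding of what the "JAVA_HOME is not set" message is complaining about.

```python
import os
import shutil
from pathlib import Path
from typing import Mapping, Optional


def find_java(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Hypothetical helper: return the path of a `java` binary, or None.

    Looks at JAVA_HOME first, then falls back to searching PATH.
    """
    env = os.environ if env is None else env

    # 1. Prefer $JAVA_HOME/bin/java when JAVA_HOME points at a real install.
    java_home = env.get("JAVA_HOME")
    if java_home:
        candidate = Path(java_home) / "bin" / "java"
        if candidate.is_file():
            return str(candidate)

    # 2. Otherwise search the directories listed in PATH.
    return shutil.which("java", path=env.get("PATH"))


if __name__ == "__main__":
    java = find_java()
    if java is None:
        print("No Java runtime found: install a JDK or set JAVA_HOME first")
    else:
        print("Java found at:", java)
```

If this prints a path, the `SparkSession.builder ... getOrCreate()` call from the question should at least get past the JAVA_HOME check. If it finds nothing, Java still needs to be installed in the environment, as in the linked answer.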

Upvotes: 0
