Reputation: 900
I'm experimenting with EMR a bit. I'm trying to run a very simple Spark program:
from pyspark.sql.types import IntegerType
mylist = [1, 2, 3, 4]
df = spark.createDataFrame(mylist, IntegerType()).show()
df.write.parquet('/path/to/save', mode='overwrite')
I launch the app by adding a step in the AWS EMR web console: I select the app from S3, select deploy mode cluster, and leave the rest blank.
The app apparently doesn't even launch; I get the following error:
Application application_1564485869414_0002 failed 2 times due to AM Container for appattempt_1564485869414_0002_000002 exited with exitCode: 13
What am I doing wrong here?
Upvotes: 0
Views: 1658
Reputation: 184
Your spark variable isn't defined in the code you tried. It might be causing the issue, since you are not passing a Spark context to the app.
Try adding:
from pyspark.sql import SparkSession
spark = SparkSession\
    .builder\
    .getOrCreate()
Add this before the call to spark.createDataFrame(...).
Upvotes: 2