anonggd

Reputation: 31

Can you use PySpark instead of Glue PySpark in AWS Glue?

I find that Glue PySpark has its own little twist on everything; for example, 'select' is 'select_fields' in Glue PySpark. How can I use plain Spark instead of the Glue version?

Upvotes: 1

Views: 909

Answers (1)

Robert Kossendey

Reputation: 6998

You can just use the SparkSession directly instead of the GlueContext wrapper:

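from pyspark.context import SparkContext
from awsglue.context import GlueContext

# Reuse the running SparkContext, or create one if none exists
sc = SparkContext.getOrCreate()
gc = GlueContext(sc)
# Plain SparkSession: everything from here on is regular PySpark
spark = gc.spark_session

df = spark.read.format(...).load(...)

# Standard DataFrame API, e.g. select instead of select_fields
df.select("*").show()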
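
If you already have a Glue DynamicFrame (for example, one read from the Data Catalog), its toDF() method gives you a regular Spark DataFrame, so select and the rest of the DataFrame API work as usual. A minimal sketch of that idea follows; to_spark_df is a hypothetical helper name, not part of awsglue, and it assumes that only a DynamicFrame has a select_fields attribute:

```python
def to_spark_df(frame):
    """Hypothetical helper: return a plain Spark DataFrame.

    A Glue DynamicFrame is converted with its toDF() method; anything
    else is assumed to already be a Spark DataFrame and is returned
    unchanged.
    """
    # Duck-typing assumption: only a DynamicFrame exposes select_fields().
    if hasattr(frame, "select_fields"):
        return frame.toDF()
    return frame
```

In a Glue job you would pass it the result of something like gc.create_dynamic_frame.from_catalog(database="my_db", table_name="my_table"), where both names are placeholders. If a later Glue-specific step needs a DynamicFrame again, DynamicFrame.fromDF(df, gc, "some_name") from awsglue.dynamicframe converts back.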

Upvotes: 2
