Reputation: 1
I want to run a SQL function from Python and add it to an AWS Glue ETL job. This is my code:
import sys
from awsglue.context import GlueContext
from pyspark.context import SparkContext
import psycopg2

# Initialize Spark and Glue contexts
sc = SparkContext()
glueContext = GlueContext(sc)

# Initialize variables
connection = None
cur = None

try:
    # Establish connection
    connection = psycopg2.connect(
        host="x",
        port="y",
        database="postgres",
        user="z",
        password="w"
    )

    # Create a cursor object using the connection
    cur = connection.cursor()

    # Execute the function with the schema name included
    cur.execute("SELECT api_backend.test_etl2()")

    # Commit the transaction
    connection.commit()
except Exception as e:
    print("Error:", e)
finally:
    # Close the cursor and connection
    if cur:
        cur.close()
    if connection:
        connection.close()
I have added pyspark.zip and psycopg.zip to the library path, and now it says ModuleNotFoundError: No module named 'awsglue.context'.

How do I run this from AWS Glue? I need help.

First I got ModuleNotFoundError: No module named 'psycopg', so I made a zip file and uploaded it to an S3 bucket. Then I got ModuleNotFoundError: No module named 'pyspark', so I made a zip file of pyspark and uploaded that to the S3 bucket as well. Now it says ModuleNotFoundError: No module named 'awsglue.context'.

What's the correct way to do it?
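From what I've read, pyspark and awsglue are already provided by the Glue Spark runtime, so maybe I shouldn't be zipping them at all, and psycopg2 can be installed through the --additional-python-modules job parameter. Is something like this the right way to set that up instead of the zips? (a rough sketch I put together from the boto3 docs; the job name, role ARN, and script path below are just placeholders for my real values)

import boto3

glue = boto3.client("glue")

# Update the existing Glue job so psycopg2 is installed from PyPI at job start.
# "my-etl-job", the role ARN, and the S3 script location are placeholders.
glue.update_job(
    JobName="my-etl-job",
    JobUpdate={
        "Role": "arn:aws:iam::123456789012:role/MyGlueRole",
        "Command": {
            "Name": "glueetl",
            "ScriptLocation": "s3://my-bucket/scripts/run_sql_function.py",
            "PythonVersion": "3",
        },
        "GlueVersion": "4.0",
        "DefaultArguments": {
            # Install psycopg2 from PyPI; pyspark and awsglue come with the
            # Glue runtime, so no zip files for them should be needed.
            "--additional-python-modules": "psycopg2-binary",
        },
    },
)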
Upvotes: 0
Views: 66