Reputation: 1
I want to run a SQL function from Python and add it to an AWS Glue ETL job. This is my code:
import sys
from awsglue.context import GlueContext
from pyspark.context import SparkContext
import psycopg2

# Initialize Spark and Glue contexts
sc = SparkContext()
glueContext = GlueContext(sc)

# Initialize variables
connection = None
cur = None

try:
    # Establish connection
    connection = psycopg2.connect(
        host="x",
        port="y",
        database="postgres",
        user="z",
        password="w"
    )

    # Create a cursor object using the connection
    cur = connection.cursor()

    # Execute the function with the schema name included
    cur.execute("SELECT api_backend.test_etl2()")

    # Commit the transaction
    connection.commit()
except Exception as e:
    print("Error:", e)
finally:
    # Close the cursor and connection
    if cur:
        cur.close()
    if connection:
        connection.close()
I have added pyspark.zip and psycopg.zip to the library path, and now it says ModuleNotFoundError: No module named 'awsglue.context'.

How do I run this from AWS Glue? I need help.

First I got ModuleNotFoundError: No module named 'psycopg', so I made a zip file and uploaded it to an S3 bucket. Then I got ModuleNotFoundError: No module named 'pyspark', so I made a zip file of pyspark and uploaded that to the S3 bucket as well. Now it says ModuleNotFoundError: No module named 'awsglue.context'.

What's the correct way to do it?
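From what I've read, pyspark and awsglue are already provided by the Glue Spark runtime, so maybe I shouldn't be zipping them at all, and psycopg2 can be installed through the --additional-python-modules job parameter. Is something like this the right way to set that up instead of the zips? (a rough sketch I put together from the boto3 docs; the job name, role ARN, and script path below are just placeholders for my real values)

import boto3

glue = boto3.client("glue")

# Update the existing Glue job so psycopg2 is installed from PyPI at job start.
# "my-etl-job", the role ARN, and the S3 script location are placeholders.
glue.update_job(
    JobName="my-etl-job",
    JobUpdate={
        "Role": "arn:aws:iam::123456789012:role/MyGlueRole",
        "Command": {
            "Name": "glueetl",
            "ScriptLocation": "s3://my-bucket/scripts/run_sql_function.py",
            "PythonVersion": "3",
        },
        "GlueVersion": "4.0",
        "DefaultArguments": {
            # Install psycopg2 from PyPI; pyspark and awsglue come with the
            # Glue runtime, so no zip files for them should be needed.
            "--additional-python-modules": "psycopg2-binary",
        },
    },
)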
Upvotes: 0
Views: 66