Jean-Baptiste

Reputation: 11

How to fix "module 'pg8000' has no attribute 'connect'" error in AWS Glue job

I'm trying to set up a daily AWS Glue job that loads data into an RDS PostgreSQL database. Because each run processes the whole dataset, I need to truncate the target tables before loading data into them.

To do this, I'm implementing the solution given here: https://stackoverflow.com/a/50984173/11952393.

It uses the pure-Python library pg8000. I followed the guidelines in that answer: I downloaded the library tarball, unpacked it, added the empty __init__.py, zipped the whole thing, uploaded the zip file to S3, and added the S3 URL as a Python library in the AWS Glue job config.
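
For reference, the truncate step that pg8000 is needed for might look roughly like the sketch below. It is not the code from the linked answer: the helper name `truncate_tables` and the table names are made up for illustration, and it only assumes a DB-API style connection such as the one `pg8000.connect(...)` returns.

```python
import re

# Table names cannot be passed as query parameters, so they are checked
# against a conservative pattern (optionally schema-qualified) instead.
_TABLE_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$")


def truncate_tables(conn, tables):
    """Truncate each table on a DB-API connection and commit once at the end.

    `conn` is assumed to behave like a pg8000 connection
    (cursor(), commit()); `tables` is a list of plain table names.
    """
    for table in tables:
        if not _TABLE_NAME.match(table):
            raise ValueError(f"unexpected table name: {table!r}")

    cursor = conn.cursor()
    try:
        for table in tables:
            cursor.execute(f"TRUNCATE TABLE {table}")
        conn.commit()
    finally:
        cursor.close()
```

In the Glue job this would be called before the load, for example `truncate_tables(pg8000.connect(user=..., password=..., host=..., database=...), ["my_table"])`, with the connection details filled in for the RDS instance.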

When I run the job, the pg8000 module seems to be imported correctly. But then I get the following error:

AttributeError: module 'pg8000' has no attribute 'connect'

I am most certainly doing something wrong, but I can't find what. Any constructive feedback is welcome!
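
One way to narrow this kind of error down is to log what Python actually imported. The sketch below is only a diagnostic aid; the helper name `describe_module` is hypothetical. If the reported file is `None`, or the path points at an unexpected location, the archive probably does not contain the real package contents.

```python
import importlib


def describe_module(name, attribute):
    """Report where a module was loaded from and whether it has an attribute."""
    module = importlib.import_module(name)
    return {
        # None for namespace packages, i.e. a directory without __init__.py
        "file": getattr(module, "__file__", None),
        # Search path of the package, empty for plain modules
        "path": list(getattr(module, "__path__", [])),
        "has_attribute": hasattr(module, attribute),
    }
```

Inside the Glue script, `print(describe_module("pg8000", "connect"))` would write the result to the job log.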

Upvotes: 1

Views: 4106

Answers (3)

user8819

Reputation: 73

I had this exact same error. It turns out that I was assembling the .zip file to be uploaded to AWS incorrectly (at the command prompt on Mac).

I created the .zip file with:

zip ../function.zip *

which resulted in the directories (such as pg8000, but all the others as well) being included, but not their contents. As such, the error message was true! There were no attributes, because there wasn't even a __init__.py file!

Instead, what should be done is:

zip ../function.zip -r *

When done correctly, the contents of the directories should be browsable within the AWS Lambda code editor.
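
The archive can also be checked locally before uploading. This is a small sketch using the standard `zipfile` module; the helper names `package_files` and `looks_importable` are made up here. It tests whether the zip holds actual files under the package directory rather than just an empty directory entry.

```python
import zipfile


def package_files(zip_path, package):
    """Return the names of regular files stored under `package/` in the archive."""
    prefix = package.rstrip("/") + "/"
    with zipfile.ZipFile(zip_path) as archive:
        return [
            name
            for name in archive.namelist()
            if name.startswith(prefix) and not name.endswith("/")
        ]


def looks_importable(zip_path, package):
    """True if the archive contains `package/__init__.py`."""
    return f"{package}/__init__.py" in package_files(zip_path, package)
```

For an archive built without `-r`, `package_files("../function.zip", "pg8000")` would be expected to return an empty list.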

Upvotes: 0

Sandeep Fatangare

Reputation: 2144

Add

install_requires = ['pg8000==1.12.5']

to the setup.py file that generates the .egg file.

You should then be able to access the library.

Upvotes: 0

Mayukh Ghosh

Reputation: 51

Here is what made it work for me.

  1. Do a pip install of the pg8000 package in a separate location

    pip install -t /tmp/ pg8000

  2. You should see these 2 directories in the /tmp directory

    pg8000
    scramp
    
  3. Zip the above 2 directories separately

    cd /tmp/
    zip -r pg8000.zip pg8000/
    zip -r scramp.zip scramp/
    
  4. Upload these 2 zip files in an S3 location

  5. While creating the job or the Dev Endpoint, list these 2 zip files in the Python Library Path field

s3://<bucket>/<prefix>/pg8000.zip,s3://<bucket>/<prefix>/scramp.zip
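
Steps 2 and 3 can also be scripted. The following is a rough sketch using `shutil.make_archive` from the standard library; the function name `zip_packages` is hypothetical, and it assumes the packages were already installed into the target directory with `pip install -t`. Each package directory ends up in its own zip, with the directory name kept as the top-level entry.

```python
import os
import shutil


def zip_packages(target_dir, packages, out_dir):
    """Create <out_dir>/<package>.zip for each package directory in target_dir."""
    archives = []
    for package in packages:
        if not os.path.isdir(os.path.join(target_dir, package)):
            raise FileNotFoundError(f"{package} not found in {target_dir}")
        # base_dir keeps "<package>/" as the prefix of every entry,
        # which matches what `zip -r pg8000.zip pg8000/` produces.
        archive = shutil.make_archive(
            os.path.join(out_dir, package),
            "zip",
            root_dir=target_dir,
            base_dir=package,
        )
        archives.append(archive)
    return archives
```

For the layout above, `zip_packages("/tmp", ["pg8000", "scramp"], "/tmp")` would write `/tmp/pg8000.zip` and `/tmp/scramp.zip`, ready to upload to S3.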

Upvotes: 0
