Reputation: 11
I am new to the cloud and to data engineering as well.
I have a large CSV file stored in a GCS bucket. I would like to write a Python script that bulk-inserts the data into a PostgreSQL database on my local machine using a COPY statement, but I cannot figure out the authentication.
I would like to do something like this:
import psycopg2

conn = psycopg2.connect(database=database,
                        user=user,
                        password=password,
                        host=host,
                        port=port)
cursor = conn.cursor()

file = 'https://storage.cloud.google.com/<my_project>/<my_file.csv>'
sql_query = f"COPY <MY_TABLE> FROM '{file}' WITH CSV"
cursor.execute(sql_query)
conn.commit()
conn.close()
I get this error message:
psycopg2.errors.UndefinedFile: could not open file "https://storage.cloud.google.com/<my_project>/<my_file.csv>" for reading: No such file or directory
HINT:  COPY FROM instructs the PostgreSQL server process to read a file. You may want a client-side facility such as psql's \copy.
The same happens when I run the query in psql.
I assume the problem is in authentication. I have set up Application Default Credentials with the Google Cloud CLI, and when acting as the authenticated user I can easily download the file with wget. When I switch to the postgres user, I get an "access denied" error.
ADC seems to work only with client libraries and command-line tools.
I use Ubuntu 22.04.1 LTS.
Thanks for any help.
Upvotes: 0
Views: 169
Reputation: 22893
This is not going to work for you. COPY FROM expects a path on the server's local filesystem that the server process is permitted to read; it will not fetch a file over HTTP.
You can supply a program or script that fetches the file for you and prints it to STDOUT, which the server can consume, as in the sketch below.
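A minimal sketch of that route, assuming gsutil is installed on the database host and the postgres OS user has credentials that can read the bucket; the table, bucket and object names and the connection details are placeholders:

import psycopg2

# Placeholder connection details for the local server.
conn = psycopg2.connect(database="mydb", user="postgres",
                        password="secret", host="localhost", port=5432)
cursor = conn.cursor()

# COPY ... FROM PROGRAM runs the command as the server's OS user (postgres),
# so that user needs its own access to the bucket, and the database role
# needs superuser or pg_execute_server_program membership.
sql_query = ("COPY <MY_TABLE> FROM PROGRAM "
             "'gsutil cat gs://<my_bucket>/<my_file.csv>' WITH CSV")
cursor.execute(sql_query)

conn.commit()
conn.close()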
Or do what the error message suggests and handle it client-side with psycopg2's copy support (see the second sketch below).
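A minimal sketch of the client-side route, assuming the google-cloud-storage library is installed (it honours your Application Default Credentials) and using psycopg2's copy_expert to stream the object to COPY ... FROM STDIN; again, names and connection details are placeholders:

import psycopg2
from google.cloud import storage

# The storage client authenticates with your Application Default Credentials,
# so the postgres OS user never needs access to the bucket.
client = storage.Client()
blob = client.bucket("<my_bucket>").blob("<my_file.csv>")

# Placeholder connection details for the local server.
conn = psycopg2.connect(database="mydb", user="postgres",
                        password="secret", host="localhost", port=5432)
cursor = conn.cursor()

# COPY ... FROM STDIN reads from the client connection, so the server
# process never has to open the file itself.
with blob.open("rt") as f:
    cursor.copy_expert("COPY <MY_TABLE> FROM STDIN WITH CSV", f)

conn.commit()
conn.close()

This keeps authentication entirely on the client side, which matches how ADC already works for you.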
Upvotes: 2