Reputation: 23
I am trying to write a Python script that accesses Amazon Redshift, creates a table in Redshift, and copies data from S3 into that table.
My code is:
import psycopg2
import os
#import pandas as pd
import requests
requests.packages.urllib3.disable_warnings()
redshift_endpoint = os.getenv("END-point")
redshift_user = os.getenv("user")
redshift_pass = os.getenv("PASSWORD")
port = 5439
dbname = 'DBNAME'
conn = psycopg2.connect(
host="",
user='',
port=5439,
password='',
dbname='')
cur = conn.cursor()
aws_key = os.getenv("access_key") # needed to access S3 Sample Data
aws_secret = os.getenv("secret_key")
#aws_iam_role= os.getenv('iam_role') #tried using this too
base_copy_string= """copy %s from 's3://mypath/%s'.csv
credentials 'aws_access_key_id= %s aws_access_secrect_key= %s'
delimiter '%s';""" # the base COPY string that we'll be using
#easily generate each table that we'll need to COPY data from
tables = ["employee"]
data_files = ["test"]
delimiters = [","]
#the generated COPY statements we'll be using to load data;
copy_statements = []
for tab, f, delim in zip(tables, data_files, delimiters):
    copy_statements.append(base_copy_string % (tab, f, aws_key, aws_secret, delim)%)
#create Table
cur.execute(""" create table employee(empname varchar(30),empno integer,phoneno integer,email varchar(30))""")
for copy_statement in copy_statements: # execute each COPY statement
    cur.execute(copy_statement)
conn.commit()
for table in tables + ["employee"]:
cur.execute("select count(*) from %s;" % (table,))
print(cur.fetchone())
conn.commit() # make sure data went through and commit our statements permanently.
When I run this script, I get an error at cur.execute(copy_statement):
**Error:** error: Invalid credentials. Must be of the format: credentials 'aws_iam_role=...' or 'aws_access_key_id=...;aws_secret_access_key=...[;token=...]'
code: 8001
context:
query: 582
location: aws_credentials_parser.cpp:114
process: padbmaster [pid=18692]
Is there a problem in my code, or is it an AWS access_key problem?
I even tried using an iam_role, but I get an error:
IAM role cannot assume role even in Redshift
I granted the IAM role permissions by attaching the S3FullAccess managed policy.
Upvotes: 2
Views: 2876
Reputation: 2863
There are some errors in your script.
1) Change base_copy_string as below:
base_copy_string= """copy %s from 's3://mypath/%s.csv' credentials 'aws_access_key_id=%s;aws_secret_access_key=%s' delimiter '%s';""" # the base COPY string that we'll be using
There must be a ; between the credentials, and there were other formatting issues with the single quotes. Also, it is aws_secret_access_key, not aws_access_secrect_key.
Check this link for detailed info: http://docs.aws.amazon.com/redshift/latest/dg/copy-usage_notes-access-permissions.html#copy-usage_notes-iam-permissions
I suggest you use IAM roles instead of credentials: http://docs.aws.amazon.com/redshift/latest/dg/loading-data-access-permissions.html
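For reference, with an IAM role attached to the cluster, only the credentials clause changes; here is a minimal sketch of what the copy string would look like (the role ARN is just a placeholder):
# same structure as base_copy_string, but authenticating through an IAM role
# attached to the cluster; the ARN below is a placeholder, not a real role
iam_copy_string = """copy %s from 's3://mypath/%s.csv'
credentials 'aws_iam_role=arn:aws:iam::123456789012:role/myRedshiftRole'
delimiter '%s';"""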
2) Change copy_statements.append as below (remove the extra % at the end):
copy_statements.append(base_copy_string % (tab, f, aws_key, aws_secret, delim))
Correct these and try again.
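Putting both fixes together, the relevant part of the script would look roughly like this (a sketch reusing the variables already defined in your script):
base_copy_string = """copy %s from 's3://mypath/%s.csv'
credentials 'aws_access_key_id=%s;aws_secret_access_key=%s'
delimiter '%s';"""

copy_statements = []
for tab, f, delim in zip(tables, data_files, delimiters):
    # no trailing % here; one % operator fills in all five placeholders
    copy_statements.append(base_copy_string % (tab, f, aws_key, aws_secret, delim))

for copy_statement in copy_statements:
    cur.execute(copy_statement)
conn.commit()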
Upvotes: 2
Reputation: 1
Just in case you missed it: install the AWS CLI, run aws configure, and enter your credentials and region. Hope this helps.
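If you go that route, one way to keep the keys out of the script is to let boto3 read the credentials that aws configure stored and feed them into the COPY string; this is only a sketch and assumes boto3 is installed:
import boto3

# pick up the credentials written by `aws configure` (~/.aws/credentials)
session = boto3.Session()
creds = session.get_credentials()
aws_key = creds.access_key
aws_secret = creds.secret_key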
Upvotes: -1
Reputation: 553
To start with, NEVER, NEVER, NEVER hardcode access keys and secret keys in your code. That rules out your first approach.
Now, coming to the right way of implementing things: you are right, an IAM role is the right way of doing it. Unfortunately, I can't work out the exact error and use case from your description. As far as I understand, you are trying to run this Python file from your computer (local machine). Hence, you need to attach a policy to your IAM user that grants access to Redshift (and every other service your code touches). Please correct me if my assumption is wrong.
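As a rough sketch of that setup, the connection details and the role ARN can come from environment variables so that nothing sensitive is hardcoded (the variable names and the ARN here are only examples):
import os
import psycopg2

# all connection details come from the environment, nothing is hardcoded
conn = psycopg2.connect(
    host=os.getenv("REDSHIFT_ENDPOINT"),
    user=os.getenv("REDSHIFT_USER"),
    password=os.getenv("REDSHIFT_PASSWORD"),
    port=5439,
    dbname=os.getenv("REDSHIFT_DBNAME"))
cur = conn.cursor()

# the IAM role attached to the cluster handles the S3 authentication
copy_sql = """copy employee from 's3://mypath/test.csv'
credentials 'aws_iam_role=%s'
delimiter ',';""" % os.getenv("REDSHIFT_IAM_ROLE_ARN")
cur.execute(copy_sql)
conn.commit()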
Upvotes: 1