Reputation: 79
I'm trying to connect to Athena using pyathenajdbc.connect(). I have AWS credentials setup via Multi-factor Authentication. When I don't include the AWS Token in connection string I get following error.
athena_conn = connect(access_key=AWS_KEY_ID, secret_key=AWS_SECRET, s3_staging_dir='s3://abc-pqr-xyz/processed/athena-outputs/',region_name=REGION)
EROR: pyathenajdbc.error.DatabaseError: The security token included in the request is invalid. (Service: AmazonAthena; Status Code: 400; Error Code: UnrecognizedClientException; Request ID: 0d488c0b-1eed-11e7-bad8-711e54af6b73)
When I include the AWS Token in connection string I get following error -->
athena_conn = connect(access_key=AWS_KEY_ID, secret_key=AWS_SECRET, token=AWS_SESSION_TOKEN, s3_staging_dir='s3://abc-pqr-xyz/processed/athena-outputs/',region_name=REGION)
ERROR: pyathenajdbc.error.DatabaseError: The security token included in the request is invalid. (Service: AmazonAthena; Status Code: 400; Error Code: UnrecognizedClientException; Request ID: 91751051-1eed-11e7-8347-153dfe3d84a6)
Does anyone know what is wrong here??
Here is my entire code.
from pyathenajdbc import connect
from pyathenajdbc.util import as_pandas
from boto3 import Session
import jpype
jvm_path = jpype.getDefaultJVMPath()
_current_credentials = Session().get_credentials()
AWS_KEY_ID = _current_credentials.access_key
AWS_SECRET = _current_credentials.secret_key
AWS_SESSION_TOKEN = _current_credentials.token
REGION = "us-east-2"
#athena_conn = connect(access_key=AWS_KEY_ID, secret_key=AWS_SECRET, s3_staging_dir='s3://abc-pqr-xyz/processed/athena-outputs/',region_name=REGION)
athena_conn = connect(access_key=AWS_KEY_ID, secret_key=AWS_SECRET, token=AWS_SESSION_TOKEN, s3_staging_dir='s3://abc-pqr-xyz/processed/athena-outputs/',region_name=REGION)
cursor = athena_conn.cursor();
query = 'SELECT * FROM xyz.ABC limit 1;'
cursor.execute(query)
df = as_pandas(cursor)
print(df)
Upvotes: 0
Views: 3754
Reputation: 1621
Assuming you have a config file under ~/.aws folder that has the region defined , you could use Session().region_name
The following works just fine (Did not have to import OS):
from pyathenajdbc import connect
from pyathenajdbc.util import as_pandas
from boto3 import Session
import jpype
jvm_path = jpype.getDefaultJVMPath()
_current_credentials = Session().get_credentials()
AWS_KEY_ID = _current_credentials.access_key
AWS_SECRET = _current_credentials.secret_key
REGION = Session().region_name
athena_conn = connect(access_key=AWS_KEY_ID,
secret_key=AWS_SECRET,
s3_staging_dir='path_to_staging_dir',
region_name=REGION)
cursor = athena_conn.cursor();
query = 'SELECT current_date;'
cursor.execute(query)
df = as_pandas(cursor)
print(df)
Upvotes: 2
Reputation: 79
from pyathenajdbc import connect
from pyathenajdbc.util import as_pandas
from boto3 import Session
import os
_current_credentials = Session().get_credentials()
os.environ['AWS_ACCESS_KEY_ID'] = _current_credentials.access_key
os.environ['AWS_SECRET_ACCESS_KEY'] = _current_credentials.secret_key
os.environ['AWS_SESSION_TOKEN'] = _current_credentials.token
athena_conn = connect(s3_staging_dir='s3://your-bucket/',
region_name='us-west-2',
aws_credentials_provider_class='com.amazonaws.athena.jdbc.shaded.com.amazonaws.auth.EnvironmentVariableCredentialsProvider')
cursor = athena_conn.cursor();
query = 'SELECT * FROM schema.table_name limit 1;'
cursor.execute(query)
df = as_pandas(cursor)
print(df)
Upvotes: 2
Reputation: 311
The issue is not straightforward but i'm guessing it has something to do with your credentials. You should investigate a bit: try printing your keys and verify if they are valid or not.
Here is an alternative that I use to input my credentials :
import configparser
aws_config_file = '~/.aws/config'
Config = configparser.ConfigParser()
Config.read(os.path.expanduser(aws_config_file))
access_key_id = Config['default']['aws_access_key_id']
secret_key_id = Config['default']['aws_secret_access_key']
Otherwise, just to make sure the issue is not related to the jdbc driver, paste the output of the following command
import pyathenajdbc
print(pyathenajdbc.ATHENA_CONNECTION_STRING)
print(pyathenajdbc.ATHENA_DRIVER_CLASS_NAME)
print(pyathenajdbc.ATHENA_DRIVER_DOWNLOAD_URL)
print(pyathenajdbc.ATHENA_JAR)
Upvotes: 0