Guddi
Guddi

Reputation: 79

Failing to querying AWS Athena from Python 2.7, when passing AWS session Token in pyathenajdbc.connect()

I'm trying to connect to Athena using pyathenajdbc.connect(). I have AWS credentials setup via Multi-factor Authentication. When I don't include the AWS Token in connection string I get following error.

athena_conn = connect(access_key=AWS_KEY_ID, secret_key=AWS_SECRET, s3_staging_dir='s3://abc-pqr-xyz/processed/athena-outputs/',region_name=REGION)

EROR: pyathenajdbc.error.DatabaseError: The security token included in the request is invalid. (Service: AmazonAthena; Status Code: 400; Error Code: UnrecognizedClientException; Request ID: 0d488c0b-1eed-11e7-bad8-711e54af6b73)

When I include the AWS Token in connection string I get following error -->

athena_conn = connect(access_key=AWS_KEY_ID, secret_key=AWS_SECRET, token=AWS_SESSION_TOKEN, s3_staging_dir='s3://abc-pqr-xyz/processed/athena-outputs/',region_name=REGION) ERROR: pyathenajdbc.error.DatabaseError: The security token included in the request is invalid. (Service: AmazonAthena; Status Code: 400; Error Code: UnrecognizedClientException; Request ID: 91751051-1eed-11e7-8347-153dfe3d84a6)

Does anyone know what is wrong here??

Here is my entire code.

from pyathenajdbc import connect
from pyathenajdbc.util import as_pandas
from boto3 import Session
import jpype
jvm_path = jpype.getDefaultJVMPath()

_current_credentials = Session().get_credentials()
AWS_KEY_ID = _current_credentials.access_key
AWS_SECRET = _current_credentials.secret_key
AWS_SESSION_TOKEN = _current_credentials.token
REGION = "us-east-2"

#athena_conn = connect(access_key=AWS_KEY_ID, secret_key=AWS_SECRET, s3_staging_dir='s3://abc-pqr-xyz/processed/athena-outputs/',region_name=REGION)

athena_conn = connect(access_key=AWS_KEY_ID, secret_key=AWS_SECRET, token=AWS_SESSION_TOKEN, s3_staging_dir='s3://abc-pqr-xyz/processed/athena-outputs/',region_name=REGION)

cursor = athena_conn.cursor();
query = 'SELECT * FROM xyz.ABC  limit 1;'
cursor.execute(query)
df = as_pandas(cursor)
print(df)

Upvotes: 0

Views: 3754

Answers (3)

nizantz
nizantz

Reputation: 1621

Assuming you have a config file under ~/.aws folder that has the region defined , you could use Session().region_name

The following works just fine (Did not have to import OS):

from pyathenajdbc import connect
from pyathenajdbc.util import as_pandas
from boto3 import Session
import jpype
jvm_path = jpype.getDefaultJVMPath()

_current_credentials = Session().get_credentials()
AWS_KEY_ID = _current_credentials.access_key
AWS_SECRET = _current_credentials.secret_key
REGION = Session().region_name

athena_conn = connect(access_key=AWS_KEY_ID,
               secret_key=AWS_SECRET,
               s3_staging_dir='path_to_staging_dir',
               region_name=REGION)

cursor = athena_conn.cursor();

query = 'SELECT current_date;'

cursor.execute(query)
df = as_pandas(cursor)
print(df)

Upvotes: 2

Guddi
Guddi

Reputation: 79

from pyathenajdbc import connect
from pyathenajdbc.util import as_pandas
from boto3 import Session
import os

_current_credentials = Session().get_credentials()

os.environ['AWS_ACCESS_KEY_ID'] = _current_credentials.access_key
os.environ['AWS_SECRET_ACCESS_KEY'] = _current_credentials.secret_key
os.environ['AWS_SESSION_TOKEN'] = _current_credentials.token


athena_conn = connect(s3_staging_dir='s3://your-bucket/',
           region_name='us-west-2',
           aws_credentials_provider_class='com.amazonaws.athena.jdbc.shaded.com.amazonaws.auth.EnvironmentVariableCredentialsProvider')

cursor = athena_conn.cursor();
query = 'SELECT * FROM schema.table_name limit 1;'
cursor.execute(query)
df = as_pandas(cursor)
print(df)

Upvotes: 2

Tewfik Ghariani
Tewfik Ghariani

Reputation: 311

The issue is not straightforward but i'm guessing it has something to do with your credentials. You should investigate a bit: try printing your keys and verify if they are valid or not.

Here is an alternative that I use to input my credentials :

import configparser    

aws_config_file = '~/.aws/config'

Config = configparser.ConfigParser()
Config.read(os.path.expanduser(aws_config_file))

access_key_id = Config['default']['aws_access_key_id']
secret_key_id = Config['default']['aws_secret_access_key']

Otherwise, just to make sure the issue is not related to the jdbc driver, paste the output of the following command

import pyathenajdbc 

print(pyathenajdbc.ATHENA_CONNECTION_STRING)
print(pyathenajdbc.ATHENA_DRIVER_CLASS_NAME)
print(pyathenajdbc.ATHENA_DRIVER_DOWNLOAD_URL)
print(pyathenajdbc.ATHENA_JAR)

Upvotes: 0

Related Questions