Sean

Reputation: 31

How to use AWS Glue metadata in queries with the DynamoDB-Athena Connector

I am trying to use Athena Federated Query with the pre-built Athena-DynamoDB Connector. I have the connector set up so that I can run queries like this:

SELECT * FROM "lambda:<connector>".[Something]."DynamoDB Table Name"

However, with some tables I get the following Java error from the connector:

GENERIC_USER_ERROR: Encountered an exception[null] from your LambdaFunction[< connector>] executed in context[retrieving meta-data] with message[Unexpected error executing Lambda function]

I believe this error comes from the connector's limited ability to infer table schema, causing it to raise a null reference exception when it encounters missing data in the table, as referenced here. So I am attempting to use the suggested solution:

As a work around, you can define the schema of this DDB table in Glue. That will cause the connector to bypass it's schema inference capability and perhaps stop the error so you can continue your work while we investigate.

I have set up a Glue crawler that crawls the table with the issue, I have run the crawler, and the table's metadata is viewable in the Glue console. However, I don't understand how to make the connector actually use this metadata instead of its own schema inference. Any query on the offending table still returns the same error.

Some other information that might be relevant, but I'm not sure:

  1. The table's name in DynamoDB has capital letters, like this: MyTable. Glue changes this to all lowercase: mytable
  2. The table contains columns with capital letters in their names, like this: MyCol1, MyCol2, etc. Glue changes these to all lowercase: mycol1, mycol2, etc.
  3. All resources have been provisioned using CloudFormation instead of the console. If anything needs to be provisioned, I would prefer to do it this way instead of via the console.
  4. The [Something] in the query above is the database name, but it seems like any arbitrary input works. I suspect that I might have to specify this correctly to get the queries working with Glue, but that's just a guess, and nothing I've tried so far has worked.

Upvotes: 2

Views: 726

Answers (1)

Agani Satria

Reputation: 107

I've forgotten some of the details, but I suspect two problems here.

  1. If your table name or column names contain capital letters, you have to map them yourself. Edit the table in Glue and add these extra table properties:
  • for the table name: key: sourceTable value: yourDynamoDBTableName
  • for the column names: key: columnMapping value: glueColumn1=DynamoDBColumn1, glueColumn2=DynamoDBColumn2

  2. If you did not create the Lambda through the SAM application template, can you confirm that your handler function is correct? Or maybe you can post it here.
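The two table properties from the first point can be generated rather than typed by hand. Below is a minimal sketch, assuming Glue's lowercased names are simply the lowercase form of the DynamoDB names (as described in the question). The helper name glue_table_properties is hypothetical, and joining the pairs with a bare comma instead of ", " is an assumption about what the connector parses most safely.

```python
def glue_table_properties(dynamodb_table, dynamodb_columns):
    """Build the extra Glue table properties (sourceTable, columnMapping).

    Hypothetical helper: assumes each Glue column name is the lowercase
    form of the corresponding DynamoDB attribute name.
    """
    # Only attributes whose names contain capital letters need a mapping.
    pairs = [f"{col.lower()}={col}" for col in dynamodb_columns if col != col.lower()]

    props = {"sourceTable": dynamodb_table}
    if pairs:
        # Assumption: pairs are separated by a bare comma (no spaces).
        props["columnMapping"] = ",".join(pairs)
    return props


props = glue_table_properties("MyTable", ["MyCol1", "MyCol2", "id"])
print(props)
# {'sourceTable': 'MyTable', 'columnMapping': 'mycol1=MyCol1,mycol2=MyCol2'}
```

The resulting key/value pairs are what you would enter as table properties in the Glue console. Since the question prefers CloudFormation, the same pairs should also fit under the Parameters of an AWS::Glue::Table TableInput, though I have not verified this against the connector.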

Upvotes: 0
