DJo

Reputation: 2177

S3 Query Exception (Fetch)

I exported data from Redshift to S3 in Parquet format and created the data catalog in Glue. I can query the table from Athena, but when I create the external schema in Redshift and query the table, I get the error below:

ERROR:  S3 Query Exception (Fetch)
DETAIL:
  -----------------------------------------------
  error:  S3 Query Exception (Fetch)
  code:      15001
  context:   Task failed due to an internal error. File 'https://s3-eu-west-1.amazonaws.com/bucket/folder/partition_key/filename.parquet_1  has an incompatible Parquet schema for column 's3://bucket/folder
  query:     560922
  location:  dory_util.cpp:717
  process:   query1_118_560922 [pid=32409]
  -----------------------------------------------

The same queries work fine in Athena.

Upvotes: 2

Views: 6123

Answers (2)

Drew

Reputation: 751

I've run into this before as well. Athena does not seem to check file schemas as strictly as Redshift does.

Every Parquet file carries its own schema definition. If the schema in a file does not match the table definition, or differs from the schema in one or more of the other files, Redshift queries will fail, while Athena queries may still succeed if the affected columns are not in the query.
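One way to find the offending file is to compare each file's embedded schema with the types the table expects. The following is only a minimal sketch, not a definitive tool: it assumes you have local copies of the Parquet files and that pyarrow is installed, and the function names, column names, and file names are made up for illustration.

```python
from typing import Dict, List, Optional, Tuple


def read_parquet_schema(path: str) -> Dict[str, str]:
    """Return {column name: type string} from one Parquet file's footer."""
    # Imported lazily so the comparison logic below works without pyarrow.
    import pyarrow.parquet as pq

    schema = pq.read_schema(path)
    return {field.name: str(field.type) for field in schema}


def find_schema_mismatches(
    expected: Dict[str, str],
    file_schemas: Dict[str, Dict[str, str]],
) -> List[Tuple[str, str, str, Optional[str]]]:
    """List (file, column, expected type, actual type) for every difference.

    The actual type is None when the column is missing from the file.
    """
    mismatches = []
    for path, actual in sorted(file_schemas.items()):
        for column, expected_type in expected.items():
            actual_type = actual.get(column)
            if actual_type != expected_type:
                mismatches.append((path, column, expected_type, actual_type))
    return mismatches


# Hypothetical example: the second file stores "amount" as a string.
expected_types = {"id": "int64", "amount": "double"}
schemas = {
    "part-0.parquet": {"id": "int64", "amount": "double"},
    "part-1.parquet": {"id": "int64", "amount": "string"},
}
for mismatch in find_schema_mismatches(expected_types, schemas):
    print(mismatch)
```

With real files you would build `schemas` as `{path: read_parquet_schema(path) for path in paths}`. Note that the type strings here are pyarrow's names, which you would have to map to the Glue/Redshift column types yourself.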

Upvotes: 0

LauriK

Reputation: 1929

The error more or less tells you what's wrong: the schema of the table/partition and the file contents differ too much. The easiest way to fix that is to run a crawler over the data location with the "update each partition definition from table" option checked.
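If you manage the crawler from code rather than the console, the same setting can be applied through the crawler's configuration JSON. This is a rough sketch under two assumptions: that the console checkbox corresponds to the `InheritFromTable` partition behaviour in the Glue crawler configuration, and that boto3 is installed with working credentials. The crawler name is hypothetical.

```python
import json


def partition_inherit_configuration() -> str:
    """Build crawler configuration JSON that makes partitions inherit
    their metadata from the table definition."""
    return json.dumps(
        {
            "Version": 1.0,
            "CrawlerOutput": {
                "Partitions": {"AddOrUpdateBehavior": "InheritFromTable"}
            },
        }
    )


def apply_and_run(crawler_name: str) -> None:
    """Update an existing crawler with the setting and start it."""
    # Imported lazily; requires boto3 and AWS credentials at call time.
    import boto3

    glue = boto3.client("glue")
    glue.update_crawler(
        Name=crawler_name,
        Configuration=partition_inherit_configuration(),
    )
    glue.start_crawler(Name=crawler_name)


# Example call (hypothetical crawler name, not from the question):
# apply_and_run("my-parquet-crawler")
```

Keep in mind this only aligns the partition metadata with the table; if a file's actual column types disagree with the table, the file itself still has to be rewritten.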

Upvotes: 1
