beni
beni

Reputation: 103

how to view data catalog table in S3 using redshift spectrum

I created external schema for my database in aws glue. I can see the list of table but I cannot look into the json data. redshift throws me this errors.

[Amazon](500310) Invalid operation: S3 Query Exception (Fetch)
Details: 
 -----------------------------------------------
  error:  S3 Query Exception (Fetch)
  code:      15001
  context:   Task failed due to an internal error. Error occured during Ion/JSON extractor match: IERR_INVALID_SYNTAX

  query:     250284
  location:  dory_util.cpp:717
  process:   query2_124_250284 [pid=12336]
  -----------------------------------------------;
1 statement failed.

I dont want to create external tables because I will create a view combining the external tables in the data catalog in aws glue.

Just an update:

I used aws glue crawler in creating the tables in the data catalog. They are in json format. If I use a job that will upload this data in redshift they are loaded as flat file (except arrays) in redshift table.

Example of json data:

{
  "array": [
    1,
    2,
    3
  ],
  "boolean": true,
  "null": null,
  "number": 123,
  "object": {
    "a": "b",
    "c": "d",
    "e": "f"
  },
  "string": "Hello World"
}

If I upload them using a job in aws glue the output will be like (as table)

see image

Now, I have trmendous amount of tables crawled in data catalog. I am struggling creating the individual script of this tables that is why an amazon redshift spectrum external schema can be helpful.

However when I query the external table in the external schema I am getting the error as posted above. I do not encounter problems with external tables from the data catalog if they are loaded as csv but the format files I need to read in redshift spectrum should be in json.

Is it possible to view the external table in redshift spectrum the same format when it is loaded using a job?

Upvotes: 0

Views: 1741

Answers (1)

SunSmiles
SunSmiles

Reputation: 186

beni,
The errors thrown by RedShift Spectrum may not always be accurate. I can only confirm the querying with JSON should work similar to other data formats. By the way, the external table needs to be corrected through SQL client within spectrum database.

So, I will suggest to refer this and this to review your steps

Upvotes: 0

Related Questions