Reputation: 454
We are using external tables in our Snowflake database to read data from some AWS S3 buckets. The buckets contain various Parquet files, spread over multiple partitions.
We are able to read the data from our external table by using Snowflake's stages, storage integrations and file formats.
However, we'd like to read some metadata from the parquet files as well, such as the precision of numeric data types (e.g., to find out how many decimal places we have to deal with).
To keep it simple, let's say we're reading data from one single parquet file.
Is there any way to retrieve the precision of numeric data types from that Parquet file's metadata, directly from Snowflake?
Or would you rather extract that metadata from, let's say, the AWS Glue Data Catalog or some other external tool?
Upvotes: 4
Views: 1683
Reputation: 11046
There's a function in a recent public preview, INFER_SCHEMA, that infers the schema of staged files and should do this:
INFER_SCHEMA(
  LOCATION => '{ internalStage | externalStage }'
  , FILE_FORMAT => '<format_name>'
)
https://docs.snowflake.com/en/sql-reference/functions/infer_schema.html
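For the single-file case in the question, a call might look roughly like the sketch below. INFER_SCHEMA is a table function, so it is wrapped in TABLE(...). The names my_parquet_format and @my_s3_stage and the file path are hypothetical placeholders, not anything from the question; substitute your own stage and file format.

```sql
-- Assumption: my_parquet_format is a hypothetical named file format;
-- skip this statement if you already have a Parquet file format defined.
CREATE OR REPLACE FILE FORMAT my_parquet_format
  TYPE = PARQUET;

-- Assumption: @my_s3_stage and the path are placeholders for your
-- existing external stage and one Parquet file inside it.
SELECT column_name, type, nullable
FROM TABLE(
  INFER_SCHEMA(
    LOCATION => '@my_s3_stage/path/to/file.parquet',
    FILE_FORMAT => 'my_parquet_format'
  )
);
```

The TYPE column holds the detected data type for each column, so for decimal columns you would expect to see something of the form NUMBER(precision, scale). It is worth checking the result against a file whose precision you already know before relying on it.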
Upvotes: 5