How to show column data type of a parquet file with Apache Drill?

Question

I'm trying to compare two sets of parquet files. One set was created with Apache Drill and the other with Apache Spark. The types in the Drill-created set are known, because the conversion uses CREATE TABLE AS and explicitly casts each column. The Spark-created set comes from a simple save of the RDD to parquet and is much larger. I'd like to get the column types from the Spark-created parquet files, but I can't query their schema with Drill.

All the parquet files were moved into or created in /tmp.

I've tried things like this:

USE dfs.tmp;
SELECT COLUMN_NAME, DATA_TYPE FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'tweet' AND TABLE_SCHEMA = 'dfs.tmp';

The tables don't appear in the results of this query, but they do show up when I issue a SHOW FILES command. My understanding of the documentation is that this is to be expected, but I don't see how I can view the data types of the parquet files.

Answers (1)
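
One approach that may work, sketched here under assumptions rather than taken from the original thread: Drill provides a typeof() function that reports the type of a value as Drill read it, so it can be applied to columns of the file-based table directly instead of going through INFORMATION_SCHEMA. The column names `id` and `created_at` below are hypothetical placeholders; `tweet` and dfs.tmp come from the question.

```sql
-- Sketch only: `id` and `created_at` are hypothetical column names;
-- replace them with columns that actually exist in the Spark-written files.
USE dfs.tmp;

-- typeof() reports the type Drill assigned to each value it read.
-- A single row is enough to see the types without scanning the whole set.
SELECT typeof(`id`)         AS id_type,
       typeof(`created_at`) AS created_at_type
FROM `tweet`
LIMIT 1;
```

If the sampled row holds a null in one of the columns, typeof() is likely to report NULL rather than the declared type, so adding a WHERE ... IS NOT NULL filter on that column may give a more useful result. Running the same query against the Drill-created set would then allow a column-by-column comparison.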
