Ajith Kumara
Reputation: 17

Hive DDL for Parquet format with complex data types

Could someone help me create the Hive DDL for this dataset, which was processed and stored in Parquet format?

properties:

{
  "freq": "8600",
  "id": "23266",
  "array": [
    {
      "ver": "201.0.0.F",
      "key_ver": "201.0.0.F",
      "key": "001I1SS",
      "code": "ACDEE",
      "prod_code": "DSADVVSS",
      "prod_key": "001123"
    }
  ],
  "ipm": null,
  "offline": "1234234209600"
}

Upvotes: 0

Views: 674

Answers (1)

gardenhead
Reputation: 2497

CREATE TABLE my_table (
  freq INT,
  id INT,
  `array` ARRAY<STRUCT<ver: STRING, key_ver: STRING, key: STRING, code: STRING, prod_code: STRING, prod_key: INT>>,
  ipm **UNKNOWN**,
  offline BIGINT
)
STORED AS PARQUET;

Since JSON has far fewer types than Hive, we cannot get all the information we need from what you posted alone. For example, we don't know what type ipm should be (it is null in the sample), or whether id should be an INT, a BIGINT, or something else.
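To make the limitation concrete, here is a minimal Python sketch that guesses a Hive type for each field of the posted sample using only what JSON itself can tell us. The helper names guess_hive_type and infer_columns are made up for illustration, and the mapping (JSON integer to BIGINT, JSON float to DOUBLE) is an assumption, not a rule that Hive or Parquet imposes.

```python
import json

# The sample record from the question, copied verbatim.
SAMPLE = """
{
  "freq": "8600",
  "id": "23266",
  "array": [
    {
      "ver": "201.0.0.F",
      "key_ver": "201.0.0.F",
      "key": "001I1SS",
      "code": "ACDEE",
      "prod_code": "DSADVVSS",
      "prod_key": "001123"
    }
  ],
  "ipm": null,
  "offline": "1234234209600"
}
"""


def guess_hive_type(value):
    """Return a candidate Hive type for one JSON value, or None if unknowable."""
    if value is None:
        return None  # JSON null carries no type information at all
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return "BOOLEAN"
    if isinstance(value, int):
        return "BIGINT"  # assumption: widest integer type is the safe guess
    if isinstance(value, float):
        return "DOUBLE"
    if isinstance(value, str):
        return "STRING"  # quoted numbers such as "8600" are still JSON strings
    if isinstance(value, list):
        if not value:
            return None  # an empty array says nothing about its element type
        inner = guess_hive_type(value[0])
        return None if inner is None else f"ARRAY<{inner}>"
    if isinstance(value, dict):
        parts = [f"{k}: {guess_hive_type(v) or 'UNKNOWN'}" for k, v in value.items()]
        return "STRUCT<" + ", ".join(parts) + ">"
    return None


def infer_columns(json_text):
    """Map each top-level field name to its guessed Hive type (or None)."""
    return {name: guess_hive_type(v) for name, v in json.loads(json_text).items()}


columns = infer_columns(SAMPLE)
for name, hive_type in columns.items():
    print(name, "->", hive_type or "UNKNOWN")
```

Every quoted value comes back as STRING and ipm comes back as unknown, so the numeric types in the DDL above are guesses that the sample alone can neither confirm nor rule out.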

Since you've already converted that JSON file to a Parquet file, you can inspect the Parquet file's schema, which records a richer set of types, to get a better idea of which Hive types to use.
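One way to do that inspection is sketched below in Python, assuming the third-party pyarrow package is installed. pq.read_schema reads only the file footer; the ARROW_TO_HIVE table, the helper names, and the file name my_data.parquet are assumptions made for illustration. The table covers only a few primitive types, and anything nested (such as the array of structs) is flagged for manual translation rather than guessed.

```python
# Assumed mapping from pyarrow primitive type names, as produced by
# str(field_type), to Hive types. Deliberately incomplete.
ARROW_TO_HIVE = {
    "bool": "BOOLEAN",
    "int32": "INT",
    "int64": "BIGINT",
    "float": "FLOAT",
    "double": "DOUBLE",
    "string": "STRING",
}


def read_fields(path):
    """Return (column name, arrow type name) pairs from a Parquet file footer."""
    import pyarrow.parquet as pq  # third-party: pip install pyarrow

    schema = pq.read_schema(path)
    return [(name, str(typ)) for name, typ in zip(schema.names, schema.types)]


def hive_column_lines(fields):
    """Turn (name, arrow type name) pairs into Hive column definitions."""
    lines = []
    for name, arrow_type in fields:
        hive_type = ARROW_TO_HIVE.get(arrow_type)
        if hive_type is None:
            # Nested or unmapped type: leave it for a human to translate.
            lines.append(f"`{name}` ???  -- translate by hand from: {arrow_type}")
        else:
            lines.append(f"`{name}` {hive_type}")
    return lines


# Made-up input pairs, only to show the output shape without needing a file.
example = hive_column_lines([("id", "int64"), ("freq", "int32"), ("nested", "list<item: string>")])
for line in example:
    print(line)

# Hypothetical usage against a real file:
# for line in hive_column_lines(read_fields("my_data.parquet")):
#     print(line)
```

If the file turns out to store freq, id and offline as strings, the Hive columns would need to be STRING as well (or cast at query time), since the table definition has to agree with what is actually in the file.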

Upvotes: 1
