Reputation: 17

Hive DDL for Parquet format with complex data types

Could someone help me create the Hive DDL for this dataset, which was processed and stored in Parquet format?

properties:

{
  "freq": "8600",
  "id": "23266",
  "array": [
    {
      "ver": "201.0.0.F",
      "key_ver": "201.0.0.F",
      "key": "001I1SS",
      "code": "ACDEE",
      "prod_code": "DSADVVSS",
      "prod_key": "001123"
    }
  ],
  "ipm": null,
  "offline": "1234234209600"
}

Upvotes: 0

Answers (1)

gardenhead

Reputation: 2497

CREATE TABLE my_table (
  freq INT,
  id INT,
  `array` ARRAY<STRUCT<ver: STRING, key_ver: STRING, key: STRING, code: STRING, prod_code: STRING, prod_key: INT>>,
  ipm **UNKNOWN**,
  offline BIGINT
)
STORED AS PARQUET;

Since JSON has far fewer types than Hive, we cannot get all the information we need from what you posted alone. For example, we don't know what the type of ipm should be (it is null in the sample), and we don't know whether id should be an INT, a BIGINT, or something else.
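To make that limitation concrete, here is a minimal sketch (not part of the original answer) that walks the sample record and reports the Hive type each JSON value can justify by itself. The helper names hive_type and column_list are hypothetical, and the mapping of JSON numbers to BIGINT/DOUBLE is an assumption chosen to be the widest safe guess.

```python
import json

# The sample record from the question (values copied verbatim).
SAMPLE = """
{
  "freq": "8600",
  "id": "23266",
  "array": [
    {
      "ver": "201.0.0.F",
      "key_ver": "201.0.0.F",
      "key": "001I1SS",
      "code": "ACDEE",
      "prod_code": "DSADVVSS",
      "prod_key": "001123"
    }
  ],
  "ipm": null,
  "offline": "1234234209600"
}
"""


def hive_type(value):
    """Map a decoded JSON value to the Hive type it can justify on its own."""
    if value is None:
        return "UNKNOWN"          # null carries no type information
    if isinstance(value, bool):   # check bool before int: bool is an int subclass
        return "BOOLEAN"
    if isinstance(value, int):
        return "BIGINT"           # assumption: JSON cannot tell INT from BIGINT
    if isinstance(value, float):
        return "DOUBLE"
    if isinstance(value, str):
        return "STRING"
    if isinstance(value, list):
        # Assumes a homogeneous array; an empty array gives no element type.
        return "ARRAY<%s>" % (hive_type(value[0]) if value else "UNKNOWN")
    if isinstance(value, dict):
        fields = ", ".join("%s: %s" % (k, hive_type(v)) for k, v in value.items())
        return "STRUCT<%s>" % fields
    raise TypeError("unexpected JSON value: %r" % (value,))


def column_list(record):
    """Return (column name, Hive type) pairs for a top-level JSON object."""
    return [(name, hive_type(value)) for name, value in record.items()]


columns = column_list(json.loads(SAMPLE))
for name, type_ in columns:
    print("`%s` %s" % (name, type_))
```

Every scalar in the sample is a quoted string, so the sketch reports STRING for each of them and UNKNOWN for ipm. Narrowing freq or id to INT or BIGINT requires knowledge that the JSON does not carry.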

Since you've already converted that JSON file to a Parquet file, you can inspect the Parquet file (which has a richer type system) to get a better idea of which schema to use.

Upvotes: 1
