Aditya
Aditya

Reputation: 109

Partial loading of data(columns) from a parquet file into relational table

I have a table of data type VARIANT that holds a parquet file.

Parquet file output

I have a relational table of the format

> CREATE TABLE IF NOT EXISTS covid_data_relational
> (
>   id int identity(1,1),
>  date_dt DATE,
>  state string,
>  value int,
>  population_percent float,
>  change_from_prior_day int,
>  seven_day_change_percent float
>  );

Inserting data into this table populates records only for columns "date" and "state". Any column that has the format cases.* is not populated.

Relational Table Output

The query is as follows:

    insert into covid_data_relational(date_dt, state, value, population_percent, change_from_prior_day, seven_day_change_percent)
 select covid_data_raw:date::date as date_dt,
covid_data_raw:state::string as state,
covid_data_raw:cases.value::int as value,
covid_data_raw:cases.calculated.population_percent::float as population_percent,
covid_data_raw:cases.calculated.change_from_prior_day::int as change_from_prior_day,
covid_data_raw:cases.calculated.seven_day_change_percent::float as seven_day_change_percent
from covid_data_parquet;

Any help is appreciated! Thanks in advance

Upvotes: 0

Views: 262

Answers (1)

Greg Pavlik
Greg Pavlik

Reputation: 11046

It would be better to post sample data as text rather than an image. People can more easily copy and paste the sample to test. So this isn't tested, but it should work.

Take a close look at the cases.calculated.change_from_prior_day key. It is not a nested key in the grandparent path cases.*. It is a hard-coded key formed by a long string with dots.

To extract that, you'll need to specify that the key is the whole string including the dots:

insert into covid_data_relational(date_dt, state, value, population_percent, change_from_prior_day, seven_day_change_percent)
 select covid_data_raw:date::date as date_dt,
covid_data_raw:state::string as state,
covid_data_raw:"cases.value"::int as value,
covid_data_raw:"cases.calculated.population_percent"::float as population_percent,
covid_data_raw:"cases.calculated.change_from_prior_day"::int as change_from_prior_day,
covid_data_raw:"cases.calculated.seven_day_change_percent"::float as seven_day_change_percent
from covid_data_parquet;

Upvotes: 1

Related Questions