user13949481

How to push a pandas dataframe to bigquery with `DATE` types?

So I have a dataframe which looks like this:

date        close_high  close_low
2020-08-23  3956.5801   3244.0070
2020-08-24  3674.2331   2860.3648
2020-08-25  3422.3573   2953.4077
2020-08-26  4493.0424   2876.9254
2020-08-27  3941.6706   2849.1039
2020-08-28  3799.7326   2682.6734
2020-08-29  4062.6515   2863.2884
2020-08-30  3775.3997   2970.6699
2020-08-31  3405.6144   3071.7572
2020-09-01  3630.2723   2651.4049

The date column is currently of datetime type. When I push this to BigQuery, it becomes TIMESTAMP. I tried converting it to date only with `df.date = pd.to_datetime(df.date).dt.date`, and the conversion works, but when I push that, the column becomes STRING. Many dashboards use this table, so I cannot compromise on the datatype. I also cannot keep manually exporting a CSV and importing it into BigQuery, because the whole point of pushing programmatically is to have a script run this on a schedule. My current workaround is to write a CSV and let BigQuery infer the schema automatically.
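The dtype change behind this behavior can be reproduced locally (sample values taken from the dataframe above):

```python
import pandas as pd

df = pd.DataFrame({'date': ['2020-08-23', '2020-08-24']})

# parsing gives a datetime64 column, which uploads as TIMESTAMP
df['date'] = pd.to_datetime(df['date'])
print(df['date'].dtype)  # datetime64[ns]

# .dt.date yields Python date objects in an object-dtype column,
# which schema inference treats as STRING
df['date'] = df['date'].dt.date
print(df['date'].dtype)  # object
```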

Upvotes: 3

Views: 4627

Answers (1)

Erfan

Reputation: 42946

If you are using `DataFrame.to_gbq`, you can pass the schema via the `table_schema` argument:

schema = [
    {'name': 'date', 'type': 'DATE'},
    {'name': 'close_high', 'type': 'FLOAT64'},
    {'name': 'close_low', 'type': 'FLOAT64'}
]

df['date'] = pd.to_datetime(df['date']).dt.date  # try skipping this line as well
df.to_gbq(tablename, project_id, table_schema=schema)
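For reference, a minimal end-to-end sketch of the approach above (the table and project names are placeholders; the `to_gbq` call is commented out so the snippet runs without GCP credentials or the pandas-gbq package installed):

```python
import pandas as pd

# sample data mirroring the question (column names taken from the schema)
df = pd.DataFrame({
    'date': ['2020-08-23', '2020-08-24'],
    'close_high': [3956.5801, 3674.2331],
    'close_low': [3244.0070, 2860.3648],
})

# convert to Python date objects so the values match a DATE column
df['date'] = pd.to_datetime(df['date']).dt.date

# the explicit schema stops BigQuery from inferring STRING for the dates
schema = [
    {'name': 'date', 'type': 'DATE'},
    {'name': 'close_high', 'type': 'FLOAT64'},
    {'name': 'close_low', 'type': 'FLOAT64'},
]

# requires pandas-gbq and GCP credentials; 'dataset.prices' and
# 'my-project' are placeholder names
# df.to_gbq('dataset.prices', project_id='my-project',
#           if_exists='replace', table_schema=schema)
```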

Upvotes: 5
