Reputation: 528
I'm trying to load data from a Pandas DataFrame into a BigQuery table. The DataFrame has a column of dtype datetime64[ns], and when I try to store the df using load_table_from_dataframe(), I get

google.api_core.exceptions.BadRequest: 400 Provided Schema does not match Table [table name]. Field computation_triggered_time has changed type from DATETIME to TIMESTAMP.
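For reference, the failing load looks roughly like this (a sketch only; the client setup and table id are placeholders, not the original code):

import pandas as pd
from google.cloud import bigquery

client = bigquery.Client()

df = pd.read_csv("input.csv")
df['computation_triggered_time'] = pd.to_datetime(df['computation_triggered_time'])

# Raises BadRequest: the datetime64[ns] column is serialized as TIMESTAMP,
# but the existing table column is DATETIME.
client.load_table_from_dataframe(df, "[table name]").result()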
The table has a schema which reads
CREATE TABLE `[table name]` (
...
computation_triggered_time DATETIME NOT NULL,
...
)
In the DataFrame, computation_triggered_time is a datetime64[ns] column. When I read the original DataFrame from CSV, I convert it from text to datetime like so:
df['computation_triggered_time'] = \
    pd.to_datetime(df['computation_triggered_time']).values.astype('datetime64[ms]')
Note: The .values.astype('datetime64[ms]') part is necessary because load_table_from_dataframe() uses PyArrow to serialize the df, and that fails if the data has nanosecond precision. The error is something like

[...] Casting from timestamp[ns] to timestamp[ms] would lose data
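To illustrate the precision issue (a minimal sketch with made-up values, not the original data):

import pandas as pd
import pyarrow as pa

s = pd.Series(pd.to_datetime(["2021-01-01 00:00:00.123456789"]))  # nanosecond precision
arr = pa.Array.from_pandas(s)                                      # timestamp[ns]

# arr.cast(pa.timestamp("ms"))           # raises ArrowInvalid: "... would lose data"
arr.cast(pa.timestamp("ms"), safe=False)  # truncates to ms, like .values.astype('datetime64[ms]')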
Upvotes: 1
Views: 1622
Reputation: 105611
This looks like a problem with Google's google-cloud-python package. Can you report the bug there? https://github.com/googleapis/google-cloud-python
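In the meantime, one workaround that may be worth trying (a sketch only; whether it helps depends on the google-cloud-bigquery version) is to pass an explicit schema via LoadJobConfig so the client treats the column as DATETIME instead of inferring TIMESTAMP:

from google.cloud import bigquery

client = bigquery.Client()
job_config = bigquery.LoadJobConfig(
    schema=[bigquery.SchemaField("computation_triggered_time", "DATETIME")],
)
client.load_table_from_dataframe(df, "[table name]", job_config=job_config).result()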
Upvotes: 1