bigmellow

Reputation: 11

BigQuery load job from pandas dataframe timestamp column reading as unix nanoseconds, not microseconds

I have had a script running for a few months, but today it ran into an issue in a load job from a pandas DataFrame that has a timestamp column.

df.published_at[0]

gives

Timestamp('2022-04-28 20:59:51-0700', tz='pytz.FixedOffset(-420)')

When I try to load to BigQuery through a load job, I get the following error:

[{'reason': 'invalidQuery', 'location': 'query', 'message': 'Cannot return an invalid timestamp value of 1651204791000000000 microseconds relative to the Unix epoch. The range of valid timestamp values is [0001-01-01 00:00:00, 9999-12-31 23:59:59.999999]; error in writing field published_at'}]

It seems that BigQuery is somehow reading that timestamp as Unix nanoseconds (1651204791000000000) rather than microseconds (which would be 1651204791000000), which puts it outside the range of acceptable values. Why is it doing that?
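For reference, pandas itself stores timestamps as nanoseconds since the Unix epoch, which matches the number in the error message:

import pandas as pd

ts = pd.Timestamp("2022-04-28 20:59:51-0700")

print(ts.value)          # 1651204791000000000 -> nanoseconds since the epoch
print(ts.value // 1000)  # 1651204791000000    -> microseconds, what BigQuery expects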

As a workaround I cast that column to strings before the load job, and the BQ schema still accepts it as a timestamp (a quick sketch of the cast is below). I'm just curious why this issue came up now and not previously.
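A minimal sketch of what I mean, using pandas only (the one-row DataFrame is a stand-in for my real data):

import pandas as pd

df = pd.DataFrame({"published_at": [pd.Timestamp("2022-04-28 20:59:51-0700")]})

# Render the tz-aware timestamps as ISO-8601-style strings; BigQuery can
# parse '2022-04-28 20:59:51-07:00' into a TIMESTAMP column on load.
df["published_at"] = df["published_at"].astype(str)

print(df.published_at[0])  # 2022-04-28 20:59:51-07:00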

Upvotes: 1

Views: 586

Answers (1)

JulioB

Reputation: 71

I come here 5 months later (29 September 2022) because I have the exact same problem.

I'm trying to load data to BigQuery from Python with client.load_table_from_json. One of my columns, processed_at, stores datetime objects (dtype: datetime64[ns, UTC]). I specify the right type in my table_schema:

table_schema = [
    bigquery.SchemaField("processed_at", "TIMESTAMP", mode="NULLABLE")
]

I get this error:

BadRequest: 400 Cannot return an invalid timestamp value of 1664454374000000000 microseconds relative to the Unix epoch.
The range of valid timestamp values is [0001-01-01 00:00:00, 9999-12-31 23:59:59.999999]; error in writing field processed_at

BigQuery really seems to think in microseconds rather than nanoseconds, so all my nanosecond datetimes fall out of range.

I will try to cast them to strings, thanks for the workaround; something like the sketch below.
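If the cast works, the load should look roughly like this. This is a sketch under assumptions, not my real code: df is the DataFrame holding my data, table_schema is defined as above, and the table id is a placeholder.

from google.cloud import bigquery

client = bigquery.Client()

# Cast the tz-aware datetimes to ISO-8601 strings such as
# '2022-09-29 12:26:14+00:00', which BigQuery parses as TIMESTAMP.
df["processed_at"] = df["processed_at"].astype(str)

job = client.load_table_from_json(
    df.to_dict(orient="records"),
    "my_project.my_dataset.my_table",  # hypothetical table id
    job_config=bigquery.LoadJobConfig(schema=table_schema),
)
job.result()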

Upvotes: 0
