Reputation: 5962
I've created a table with partition type day, and I have set time_partitioning_expiration to 1209600 seconds (14 days) from bq
command line tool. I have verified if the settings are correct by running bq show
on table and I can see
"timePartitioning": {
"expirationMs": "1209600000",
"type": "DAY"
},
"type": "TABLE"
However there seems to be data in the partitions that I expected to have been deleted.
SELECT
count(*) as c,
_partitiontime as pDate
FROM [poc.reporting]
group by pDate
;
1 373800 2016-07-17 00:00:00 UTC
2 640800 2016-07-18 00:00:00 UTC
3 373800 2016-07-16 00:00:00 UTC
Is my understanding that setting time_partition_expiration will remove/delete partitions internally correct? If so why hasn't partitions been deleted in my case?
Upvotes: 3
Views: 4345
Reputation: 61
It looks like the behavior described in the top answer from Pavan changed on Dec 13, 2016, and expiration time is now based on the partition date, and NOT the last modified date.
From the most recent docs:
For example, if the partition expiration is set to 60 days and data is written to partition 20170101 on January 30, 2017, then the data expires on January 1, 2017 + 60 days.
Reference: BQ Partitioned Tables
Upvotes: 6
Reputation: 2315
Expiration of partitions is not based on the partition's date but it is based on the time at which storage was exactly added to the table. So, for example, for a table with partition expiration of 14 days, if a partition 20160717's data was restated on 20160730, the data for that partition would expire 14days from 20160730. You can verify the write time of the oldest data in the partition using the following query:
SELECT MSEC_TO_TIMESTAMP(creation_time) from [project:dataset.table$__PARTITIONS_SUMMARY__]
Upvotes: 3