Reputation: 517
I get the correct count after I run the ANALYZE statement. But my problem is, it needs to be run every time of the count is updated. Technically I should be able to update the count for the same partition.
But it returns the same count if I don't execute the ANALYZE statement.
This is the query I execute for the count to be updated.
ANALYZE TABLE bi_events_identification_carrier_sam PARTITION(year, month, day) COMPUTE STATISTICS;
And executing is not convenient at all. any ideas?
Upvotes: 0
Views: 2004
Reputation: 2224
Your count(*) query is using stats to get the result.
If you are using spark to write data, then you can set spark.sql.statistics.size.autoUpdate.enabled
to true
. This makes sure that Spark updates the table stats automatically after the write is done.
If you are using Hive, you can set set hive.stats.autogather=true;
.
Once you enable these settings, then the write query will automatically update the stats and the subsequent read query will work fine.
Upvotes: 2