Machine Learning

Reputation: 515

How to save costs when calling BigQuery from a Google Cloud Function

I have 10,000 images added to my Google Cloud Storage bucket on a daily basis.

This bucket has a Cloud Functions event trigger, and each triggered function runs a BigQuery scan.

My cloud function checks for existing records in BigQuery 10,000 times a day, which is running up my BigQuery bill to an unsustainable amount.

I would like to know if there is a way to query the database once and store the results somewhere they are available to all triggered Cloud Function invocations.

In summary: query the database once, store the query results, and reuse those results across all Cloud Function invocations. This way I do not hit BigQuery 10,000+ times a day.

P.S. BigQuery processed 288 terabytes of data, which is an insane bill $$$

Upvotes: 1

Views: 216

Answers (1)

Felipe Hoffa

Reputation: 59235

That's an insane bill.

Step 1: Set up cost controls (custom query quotas and per-query byte limits). You will never deal with an insane bill again.
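One per-query cost control BigQuery offers is a cap on bytes billed: a job that would exceed the cap fails instead of billing. A stdlib-only sketch of that guard, where `estimate_bytes` stands in for BigQuery's dry-run byte estimate and `execute_query` for the real client call (both names are assumptions for illustration):

```python
class QueryCostError(Exception):
    """Raised instead of running a query that would bill too much."""


MAX_BYTES_BILLED = 10 * 1024 ** 3  # 10 GiB cap per query (illustrative value)


def guarded_query(query: str, estimate_bytes, execute_query):
    """Refuse to run any query whose estimated scan exceeds the cap.

    Mirrors the behavior of a per-query byte limit: the expensive query
    fails fast rather than silently running up the bill.
    """
    estimated = estimate_bytes(query)
    if estimated > MAX_BYTES_BILLED:
        raise QueryCostError(
            f"query would scan {estimated} bytes, over the {MAX_BYTES_BILLED} cap"
        )
    return execute_query(query)
```

With a cap like this in place, a runaway query surfaces as an immediate error in the function logs instead of a surprise at the end of the month.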

Step 2: Instead of querying again, check out the results of the previous job. If it's the same query and data, the same results are still valid. The API method is jobs.getQueryResults().
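The idea in Step 2 is that fetching a finished job's results costs nothing, so a new invocation should reuse the previous job instead of submitting a new query. A stdlib-only stand-in for that flow (the dictionaries and helper names are illustrative; with the real API you would persist the job ID and call jobs.getQueryResults() on it):

```python
# Minimal stand-in for the jobs API: running a query returns a job ID, and
# results can later be fetched by that ID without a new scan.
_job_store: dict[str, list] = {}
_last_job_id: dict[str, str] = {}


def run_query(query: str, execute_query) -> str:
    """Submit the query once and remember its job ID (hypothetical helper)."""
    job_id = f"job_{len(_job_store)}"
    _job_store[job_id] = execute_query(query)
    _last_job_id[query] = job_id
    return job_id


def get_query_results(job_id: str) -> list:
    """Analogue of jobs.getQueryResults(): reads stored results, no new scan."""
    return _job_store[job_id]


def results_for(query: str, execute_query) -> list:
    """Reuse the previous job for this query if one exists."""
    job_id = _last_job_id.get(query) or run_query(query, execute_query)
    return get_query_results(job_id)
```

As the answer notes, this is only valid while the query and the underlying data are unchanged; if the table has new data since the last job, run a fresh query.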

Step 3: Cluster your tables so queries that filter on the cluster keys scan only the relevant blocks. You'll probably find an additional 90% cost reduction.
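An illustrative sketch of what Step 3 looks like in DDL, held in Python strings here since the question's function is likely Python. The dataset, table, and column names are assumptions; the point is that with the table clustered on the column the function filters by, each existence check scans only the blocks containing that key rather than the full table:

```python
# Hypothetical clustered copy of the table the Cloud Function checks.
ddl = """
CREATE TABLE mydataset.images_clustered
PARTITION BY DATE(created_at)
CLUSTER BY image_id
AS SELECT * FROM mydataset.images
"""

# The per-image existence check the function would run against it.
lookup = """
SELECT image_id
FROM mydataset.images_clustered
WHERE image_id = @image_id  -- pruned to a few blocks by the cluster key
"""
```

Combined with partitioning on the ingest date, a lookup that also filters on `created_at` prunes both partitions and cluster blocks, which is where reductions on the order the answer mentions come from.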

Upvotes: 2
