Myone

Reputation: 1131

Protect against BigQuery spam

I have the following software components:

Fact: BigQuery bills per query, and flat-rate pricing is very expensive.

Problem: if someone spams the GET /product-purchase-event endpoint, every call executes a new query, so after 1 million spam requests I will get a very nice bill.

My question: Can you protect BigQuery against spam? I know there's a 24-hour cache, but I want the data to be as close to real-time as possible.

I also know there are other solutions such as Amazon Redshift which bills per hour and not per query, but I wanted to know if I could solve this spam issue with BigQuery. It seems like most people use it for in-house only, meaning no external person can execute queries, so then spam isn't an issue.

Upvotes: 1

Views: 64

Answers (1)

shollyman

Reputation: 4384

As you've correctly surmised, it's a Bad Idea(tm) to wire interactive public web endpoints to a handler that runs a BigQuery query directly. There are multiple factors here, including cost and latency. Additionally, querying a table that's receiving streaming inserts means that you won't be able to leverage the basic BigQuery caching mechanism, and you will quickly reach concurrency limits once your public handler starts to get a reasonable amount of load.

A more typical pattern here is to compute your aggregates periodically via a BigQuery query, and then read and propagate those query results into a storage layer/system that's more appropriate for serving results in a point-lookup fashion. For example, something like Datastore, an in-memory key-value store, or even something like Cloud SQL.

This decouples your serving architecture from the data processing, and the public handler is greatly simplified: it simply fetches the aggregate from the storage layer.
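To make the decoupling concrete, here is a minimal sketch, not a definitive implementation. Everything named here is hypothetical: `AggregateStore` is an in-memory stand-in for whatever serving layer you pick, `run_aggregate_query` is a callable you would supply that runs the BigQuery job and returns a mapping of product ID to count, and `handle_product_purchase_event` stands in for your GET handler.

```python
import threading
from typing import Callable, Dict


class AggregateStore:
    """In-memory stand-in for the serving layer (assumption: a real
    deployment would use Datastore, a key-value store, Cloud SQL, etc.)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._data: Dict[str, int] = {}

    def replace_all(self, rows: Dict[str, int]) -> None:
        # Swap in a fresh copy of the precomputed aggregates.
        with self._lock:
            self._data = dict(rows)

    def get(self, key: str, default: int = 0) -> int:
        # Point lookup; never touches BigQuery.
        with self._lock:
            return self._data.get(key, default)


def refresh(store: AggregateStore,
            run_aggregate_query: Callable[[], Dict[str, int]]) -> None:
    """Run the (hypothetical) aggregate query once and publish its results.
    Meant to be called by a scheduler, not by the public handler."""
    store.replace_all(run_aggregate_query())


def handle_product_purchase_event(store: AggregateStore,
                                  product_id: str) -> Dict[str, object]:
    """Public handler: reads the stored aggregate only."""
    return {"product_id": product_id, "purchases": store.get(product_id)}
```

With this shape, the number of BigQuery queries depends only on how often `refresh` runs, not on how many requests hit the endpoint.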

You can also deal with the "when do we recompute" question with far more nuance. You can define your processing to simply re-run on a fixed interval, leverage awareness of data staleness, or build some custom caching semantics based on other signals in your environment.
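As one illustration of the staleness-based option, the sketch below recomputes only when the stored result is older than a chosen maximum age. It is a sketch under stated assumptions: `StalenessAwareCache`, `recompute`, and `max_age_seconds` are hypothetical names, `recompute` would wrap your BigQuery query, and the injectable `clock` exists only to make the behaviour easy to test.

```python
import threading
import time
from typing import Callable, Dict, Optional


class StalenessAwareCache:
    """Holds the last aggregate result and recomputes it only when it is
    older than max_age_seconds (hypothetical helper, not a BigQuery API)."""

    def __init__(self,
                 recompute: Callable[[], Dict[str, int]],
                 max_age_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._recompute = recompute
        self._max_age = max_age_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._value: Optional[Dict[str, int]] = None
        self._computed_at: Optional[float] = None

    def get(self) -> Dict[str, int]:
        # The lock ensures concurrent callers trigger at most one recompute
        # per staleness window; other callers wait and reuse the result.
        with self._lock:
            now = self._clock()
            stale = (self._computed_at is None
                     or now - self._computed_at >= self._max_age)
            if stale:
                self._value = self._recompute()
                self._computed_at = now
            assert self._value is not None
            return self._value
```

With a `max_age_seconds` of 60, for example, a flood of requests results in at most one recompute per minute, so the cost ceiling is set by the staleness you are willing to accept rather than by traffic. Note that the request which finds the data stale still waits for the query; running the refresh from a background job avoids that.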

Upvotes: 1
