searain
searain

Reputation: 3301

Insert record to BigQuery streaming vs simple insert - using table->insertRow or table->inserRows vs using bigquery->runquery

If I use table->insertRow() or table->inserRows() functions, that means I am streaming data into bigquery.

https://googlecloudplatform.github.io/google-cloud-php/#/docs/google-cloud/v0.53.0/bigquery/table

But if I use bigquery->runQuery() function to run the insert statements, that is not streaming.

https://googlecloudplatform.github.io/google-cloud-php/#/docs/google-cloud/v0.53.0/bigquery/bigqueryclient?method=runQuery

In what case I should using streaming to insert data? and in what case, I should use bigquery runQuery to run a insert statement?

Thanks!

Upvotes: 13

Views: 20185

Answers (1)

dsesto
dsesto

Reputation: 8178

Just as a complement to the answer that @GrahamPolley provided in the comments to your question (and for the sake of having an answer that is more accessible for the community), let me share some documentation links that may be useful for the questions that you made in this post.

Streaming Inserts vs. DML Insert

Streaming inserts use the tabledata.insertAll method to insert one record at a time in a table, and it does not require that a load job is run. In this documentation page, you can find all the information related to Streaming Inserts, its features, limitations, and quotas.

As stated in the comment to your question, using DML inserts you can only perform 1000 INSERT queries per table per day (and the same limit applies to load jobs too), while for streaming inserts, you don't have a limit per day, and only maximum rows per second and request.

Data availability

Data that is streamed into BigQuery is available shortly (within a few seconds) after the first streaming insertion in a table, even if it is in the buffer. However, this data may be unavailable for other operations (such as copy, export or even some API methods like tabledata.list), so if you are performing any of those, you will first have to check that the buffer is empty. You can do that by checking the response from tables.get in a section named streamingBuffer, which should be empty is the buffer is empty.

Upvotes: 27

Related Questions