Reputation: 3301
If I use table->insertRow() or table->inserRows() functions, that means I am streaming data into bigquery.
https://googlecloudplatform.github.io/google-cloud-php/#/docs/google-cloud/v0.53.0/bigquery/table
But if I use bigquery->runQuery() function to run the insert statements, that is not streaming.
In what case I should using streaming to insert data? and in what case, I should use bigquery runQuery to run a insert statement?
Thanks!
Upvotes: 13
Views: 20185
Reputation: 8178
Just as a complement to the answer that @GrahamPolley provided in the comments to your question (and for the sake of having an answer that is more accessible for the community), let me share some documentation links that may be useful for the questions that you made in this post.
Streaming inserts use the tabledata.insertAll
method to insert one record at a time in a table, and it does not require that a load job is run. In this documentation page, you can find all the information related to Streaming Inserts, its features, limitations, and quotas.
As stated in the comment to your question, using DML inserts you can only perform 1000 INSERT queries per table per day (and the same limit applies to load jobs too), while for streaming inserts, you don't have a limit per day, and only maximum rows per second and request.
Data that is streamed into BigQuery is available shortly (within a few seconds) after the first streaming insertion in a table, even if it is in the buffer. However, this data may be unavailable for other operations (such as copy, export or even some API methods like tabledata.list
), so if you are performing any of those, you will first have to check that the buffer is empty. You can do that by checking the response from tables.get
in a section named streamingBuffer
, which should be empty is the buffer is empty.
Upvotes: 27