user9114945

Reputation: 269

BigQuery - Can only fetch 100,000 Records

I have a table with around 50,000,000 records, and I would like to fetch a single column for the whole table:

SELECT id FROM `project.dataset.table`

Running this query in the web console takes around 80 seconds.

However, when doing this with the Ruby gem, I can only fetch 100,000 records per request. With the #next method I can access the next 100,000 records:

require "google/cloud/bigquery"

@big_query = Google::Cloud::Bigquery.new(
  project: "project",
  keyfile: "keyfile"
)

@dataset = @big_query.dataset("dataset")
@table   = @dataset.table("table")

queue = @big_query.query("SELECT id FROM `project.dataset.table`", max: 1_000_000)
stash = queue

loop do
  queue = queue.next
  break unless queue          # #next returns nil when there are no more pages
  O.timed stash.size          # O.timed is a custom timing/logging helper
  stash += queue
end

The problem is that each request takes around 30 seconds, and max: 1_000_000 has no effect; I'm still stuck at 100,000 rows per page. That means roughly 500 requests at ~30 seconds each, so fetching the whole table takes over 4 hours, which is not acceptable.

What am I doing wrong?

Upvotes: 1

Views: 940

Answers (1)

Pentium10

Reputation: 207952

You should instead run an export job; that way you will have the data as file(s) on GCS (Google Cloud Storage). Downloading from there is easy.

https://cloud.google.com/bigquery/docs/exporting-data

A Ruby example is here: https://github.com/GoogleCloudPlatform/google-cloud-ruby/blob/master/google-cloud-bigquery/lib/google/cloud/bigquery.rb
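Roughly like this (a minimal, untested sketch: it assumes a bucket named my-bucket that your service account can write to, a google-cloud-bigquery version that provides extract_job, and the google-cloud-storage gem for the download step):

require "google/cloud/bigquery"
require "google/cloud/storage"

bigquery = Google::Cloud::Bigquery.new(project: "project", keyfile: "keyfile")
table    = bigquery.dataset("dataset").table("table")

# Export the table to GCS. The wildcard lets BigQuery shard the output,
# which is required once the exported data exceeds 1 GB.
extract_job = table.extract_job("gs://my-bucket/table-export-*.csv")
extract_job.wait_until_done!
raise extract_job.error["message"] if extract_job.failed?

# Download the exported shards from GCS.
storage = Google::Cloud::Storage.new(project: "project", keyfile: "keyfile")
bucket  = storage.bucket("my-bucket")
bucket.files(prefix: "table-export-").each do |file|
  file.download(file.name)
end

Compared to paging through the result 100,000 rows at a time, the extract job runs entirely server-side, and what you download afterwards are plain CSV files.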

Upvotes: 2
