Alexis Pautrot

Reputation: 1165

How to monitor very slow data loading in BigQuery

I'm loading uncompressed JSON files into BigQuery from C#, using the Google API method BigQueryClient.UploadJsonAsync. The uploaded files range from 1MB to 400MB. I've uploaded many TB of data over the last few months with no issues, but for the past two days uploading to BigQuery has become very slow.

I used to upload at 600MB/s, but now I get 15MB/s at most. I have checked my connection, and I can still exceed 600MB/s in connection tests such as Speed Test.

Strangely, BigQuery load throughput also seems to depend on the time of day. Around 3PM PST, my throughput falls to about 5-10MB/s.

I have no idea how to investigate this. Is there a way to monitor BigQuery data loading?

Upvotes: 1

Views: 274

Answers (1)

shollyman

Reputation: 4384

It's unclear whether you're measuring the time from when you start sending bytes until the load job is inserted, or the time from when you start sending until the load job completes. The first is primarily a question of network-level throughput, whereas the second also includes ingestion time in the BigQuery service. You can examine the load job metadata to help figure this out.
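As a rough illustration of that split, here is a minimal sketch in Python that works on the job resource as returned by the REST API (`jobs.get`), where `statistics.creationTime`, `statistics.startTime`, `statistics.endTime` and `statistics.load.inputFileBytes` are millisecond timestamps and byte counts encoded as strings. The function name `split_load_timing` and the `send_started_ms` argument (your own client-side timestamp taken just before the upload call) are hypothetical. It also assumes that `creationTime` roughly marks the moment the upload finished and the job was inserted, and that client and server clocks are reasonably in sync. The C# client should expose the same statistics on the job object it returns, but check that against its documentation.

```python
def split_load_timing(send_started_ms, job_resource):
    """Split a load into upload, queue and ingestion phases.

    send_started_ms: client-side epoch milliseconds recorded just before
        the upload call (hypothetical; you have to capture this yourself).
    job_resource: dict shaped like the REST job resource from jobs.get.
    """
    stats = job_resource["statistics"]
    # The REST API encodes these int64 values as strings.
    created_ms = int(stats["creationTime"])
    started_ms = int(stats["startTime"])
    ended_ms = int(stats["endTime"])
    input_bytes = int(stats.get("load", {}).get("inputFileBytes", 0))

    # Assumption: the job is inserted once the bytes have been received,
    # so this interval approximates the network-bound part.
    upload_s = (created_ms - send_started_ms) / 1000.0
    # Time the job waited before BigQuery started working on it.
    queue_s = (started_ms - created_ms) / 1000.0
    # Time spent inside the BigQuery service ingesting the data.
    ingest_s = (ended_ms - started_ms) / 1000.0

    def mb_per_s(seconds):
        # Avoid dividing by zero or by a negative value caused by clock skew.
        return (input_bytes / 1e6) / seconds if seconds > 0 else None

    return {
        "upload_seconds": upload_s,
        "queue_seconds": queue_s,
        "ingest_seconds": ingest_s,
        "upload_mb_per_s": mb_per_s(upload_s),
        "ingest_mb_per_s": mb_per_s(ingest_s),
    }
```

Logging these numbers for every load, along with the hour of day, should show whether the slowdown sits in the upload phase (network) or in the queue and ingestion phases (service side).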

If you're trying to suss out network issues with sites like speedtest, make sure you're choosing a suitable remote node to test against; by default, they favor something with close network locality relative to the client you are testing.

Upvotes: 2
