Wiil

Reputation: 649

bq load: "BigQuery error in load operation: Not found: Project ..."

I'm trying to load some ndjson data. First, creating a table works flawlessly:

> bq mk --table <project-id>:<my-dataset.newtable> newtable.schema.json
Table '<project-id>:<my-dataset.newtable>' successfully created.

However, the bq load command does not:

> bq load --source_format=NEWLINE_DELIMITED_JSON <project-id>:<my-dataset.newtable> gs://<project-id>.appspot.com/newtable.ndjson
BigQuery error in load operation: Not found: Project <project-friendly-name>

Please also note:

Is there an issue with some environment variables that have not been set correctly?

Upvotes: 4

Views: 12560

Answers (4)

Rob Fisher

Reputation: 1015

In May 2021 I still hit this problem when using bq load with files above a certain size.

The workaround described on a bug-tracker in August 2020 worked for me: https://github.com/googleapis/google-api-python-client/issues/1006

Specifically, I edited the file ~/google-cloud-sdk/platform/bq/third_party/httplib2/python3/__init__.py.

Find this line: REDIRECT_CODES = frozenset((300, 301, 302, 303, 307, 308)) and remove the number 308 from the set.
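
The manual edit above could be scripted roughly as follows. This is only a sketch: the function name drop_308 is made up for illustration, the path in the usage comment is the one quoted in this answer and may differ on other installs, and it assumes the REDIRECT_CODES line looks exactly as shown. A .bak copy is kept so the change can be reverted.

```shell
# Hypothetical helper: remove 308 from the REDIRECT_CODES line of the given file.
# Keeps the original as <file>.bak so the edit can be undone.
drop_308() {
  file="$1"
  [ -f "$file" ] || { echo "not found: $file" >&2; return 1; }
  sed -i.bak '/^REDIRECT_CODES/ s/, *308//' "$file"
}

# Usage (path taken from the answer above; adjust to your SDK install):
# drop_308 ~/google-cloud-sdk/platform/bq/third_party/httplib2/python3/__init__.py
```

An SDK update will probably overwrite the file, so the edit may need to be reapplied afterwards.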

I suspect this is not the correct fix, but it got me going. Since it only affects the copy of httplib2 used by the bq command, it will hopefully not have other harmful effects, but beware.

Upvotes: 0

James T.

Reputation: 978

My solution for this error was to omit the --location=[LOCATION] option from the bq load command. I did not see any default location for my GCP project.

Upvotes: 0

Inam Imthiyaz

Reputation: 39

The bq load command usually has the following structure:

bq --location=[LOCATION] load --source_format=[FORMAT] [DATASET].[TABLE] [PATH_TO_SOURCE] [SCHEMA]

As in the standard bq load command, you don't have to specify the project if you are loading data within the same project your CLI is logged in to. You also need to provide the schema unless the auto-detect flag is set in your command.

The following command lets you identify the project your CLI is currently configured to use:

gcloud config list
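
To see just the active project rather than the whole configuration, gcloud config get-value project can be used. The sketch below compares it with the project you expect; check_project is a hypothetical helper name, not part of gcloud.

```shell
# Hypothetical helper: warn when the active gcloud project differs from the expected one.
check_project() {
  expected="$1"
  # Ask gcloud for the project set in the active configuration.
  active="$(gcloud config get-value project 2>/dev/null)"
  if [ "$active" = "$expected" ]; then
    echo "active project is $expected"
  else
    echo "active project is '$active', expected '$expected'" >&2
    return 1
  fi
}

# Usage:
# check_project <project-id>
```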

Upvotes: 2

Wiil

Reputation: 649

OK. Interestingly, and unlike bq mk, bq load does not pick up the project from [PROJECT_ID]:[DATASET].[TABLE] or from bq init, and passing --location=[LOCATION] with a fully-qualified Cloud Storage URI as the file makes no difference either.

I still had to run either:

  • gcloud config set project <project-id>
  • bq load --project_id=<project-id> ...

or

  • gcloud init and choose the targeted project as the default.

So to sum up, this works:

bq load --project_id=<project-id> --source_format=NEWLINE_DELIMITED_JSON <my-dataset.newtable> gs://<project-id>.appspot.com/newtable.ndjson
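
As a sketch, the working command can be wrapped so that the project is always passed explicitly. load_ndjson is a hypothetical helper, not part of bq; it only prints the command unless RUN=1 is set, so the result can be checked before anything is loaded.

```shell
# Hypothetical wrapper: always pass --project_id explicitly to bq load.
# Arguments: project id, dataset.table, Cloud Storage URI.
load_ndjson() {
  project="$1"; table="$2"; uri="$3"
  # Rebuild the positional parameters as the full bq command line.
  set -- bq load --project_id="$project" \
    --source_format=NEWLINE_DELIMITED_JSON "$table" "$uri"
  if [ "${RUN:-0}" = "1" ]; then
    "$@"            # actually run the load
  else
    echo "$*"       # dry run: only print the command
  fi
}

# Usage:
# load_ndjson <project-id> <my-dataset.newtable> gs://<project-id>.appspot.com/newtable.ndjson
```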

Upvotes: 7
