Reputation: 649
I'm trying to load some ndjson data. First, creating a table works flawlessly:
> bq mk --table <project-id>:<my-dataset.newtable> newtable.schema.json
Table '<project-id>:<my-database.newtable>' successfully created.
However, the bq load command does not:
> bq load --source_format=NEWLINE_DELIMITED_JSON <project-id>:<my-dataset.newtable> gs://<project-id>.appspot.com/newtable.ndjson
BigQuery error in load operation: Not found: Project <project-friendly-name>
Please also note:
I have no problem running the job from BigQuery's web interface.
I have set <project-id> as the default project through the bq init command, but I get the same error, even when creating a table, when I don't specify it.
Is there an issue with environment variables that have not been set correctly?
Upvotes: 4
Views: 12560
Reputation: 1015
In May 2021 I still hit this problem when using bq load with files above a certain size.
The workaround described on a bug-tracker in August 2020 worked for me: https://github.com/googleapis/google-api-python-client/issues/1006
Specifically, I edited the file ~/google-cloud-sdk/platform/bq/third_party/httplib2/python3/__init__.py.
Find this line:
REDIRECT_CODES = frozenset((300, 301, 302, 303, 307, 308))
and remove the number 308 from the set.
I suspect this is not the correct fix, but it got me going. Since it only affects the copy of httplib2 used by the bq command, it will hopefully not have other harmful effects, but beware.
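For anyone who prefers to script the edit, here is a minimal sketch. The function name patch_redirect_codes is hypothetical, and it assumes the REDIRECT_CODES line in your copy of the file matches the one quoted above exactly; if it does not, sed leaves the file unchanged. A .bak copy is kept so the change can be reverted.

```shell
# Hypothetical helper: drop 308 from REDIRECT_CODES in the given httplib2 __init__.py.
# Assumes the line is spelled exactly as quoted in the answer above.
patch_redirect_codes() {
  target="$1"
  if [ ! -f "$target" ]; then
    echo "file not found: $target" >&2
    return 1
  fi
  # -i.bak edits in place and keeps the original as "$target.bak"
  # (this form is accepted by both GNU and BSD sed).
  sed -i.bak \
    's/^REDIRECT_CODES = frozenset((300, 301, 302, 303, 307, 308))/REDIRECT_CODES = frozenset((300, 301, 302, 303, 307))/' \
    "$target"
}

# Example call, using the path from the answer (adjust to your SDK install):
# patch_redirect_codes ~/google-cloud-sdk/platform/bq/third_party/httplib2/python3/__init__.py
```

To undo the change, copy the .bak file back over the patched one. An SDK update will likely overwrite the file as well.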
Upvotes: 0
Reputation: 978
I solved this error by omitting the --location=[LOCATION] option from the bq load command. I did not see any default location for my GCP project.
Upvotes: 0
Reputation: 39
The bq load command usually has the following structure:
bq --location=[LOCATION] load --source_format=[FORMAT] [DATASET].[TABLE] [PATH_TO_SOURCE] [SCHEMA]
As in the standard bq load command, you don't have to specify the project if you are loading data within the same project that your CLI is logged in to. You also need to provide the schema unless the auto-detect flag is set in your command.
The following command shows the project your CLI is currently configured to use:
gcloud config list
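If you only want the project value, a small check like the one below may help. This is a sketch: check_active_project is a hypothetical helper name, and it assumes that bq picks up the same active project as gcloud, which matches what others in this thread report but is worth verifying on your setup.

```shell
# Hypothetical helper: compare gcloud's active project with the one you expect.
check_active_project() {
  expected="$1"
  # `gcloud config get-value project` prints the project of the active configuration.
  active="$(gcloud config get-value project 2>/dev/null)"
  if [ "$active" = "$expected" ]; then
    echo "active project matches: $active"
  else
    echo "active project is '$active', expected '$expected'" >&2
    return 1
  fi
}

# Example call (replace with your own project ID):
# check_active_project <project-id>
```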
Upvotes: 2
Reputation: 649
OK. Interestingly, and unlike with bq mk, with bq load, selecting the <project-id> with [PROJECT_ID]:[DATASET].[TABLE], or through bq init (and the --location=[LOCATION] option with a fully-qualified Cloud Storage URI as a file), still has no effect.
I still had to run either:
gcloud config set project <project-id>
bq load --project_id=<project-id> ...
or
gcloud init
and choose the targeted project as the default.
So to sum up, this works:
bq load --project_id=<project-id> --source_format=NEWLINE_DELIMITED_JSON <my-dataset.newtable> gs://<project-id>.appspot.com/newtable.ndjson
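To avoid forgetting the flag, the command can be wrapped in a small function. This is only a sketch: load_ndjson is a hypothetical name, and the flag placement simply mirrors the command above that worked for me.

```shell
# Hypothetical wrapper: always pass the project explicitly to bq load.
load_ndjson() {
  if [ "$#" -ne 3 ]; then
    echo "usage: load_ndjson PROJECT_ID DATASET.TABLE GCS_URI" >&2
    return 2
  fi
  project="$1"
  table="$2"
  uri="$3"
  # Same flags as the working command: explicit project, NDJSON source format.
  bq load --project_id="$project" \
    --source_format=NEWLINE_DELIMITED_JSON \
    "$table" "$uri"
}

# Example call (placeholders as in the question):
# load_ndjson <project-id> <my-dataset.newtable> gs://<project-id>.appspot.com/newtable.ndjson
```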
Upvotes: 7