Reputation: 203
I'm trying to query a large amount of data in BigQuery and then write the result to a table in the desired dataset (datasetxxx), using "datalab" with PyCharm as the IDE. Below is my code:
query = bq.Query(sql=myQuery)
job = query.execute_async(
    output_options=bq.QueryOutput.table('datasetxxx._tmp_table', mode='overwrite', allow_large_results=True))
job.result()
However, I ended up with "No project ID found". The project ID comes from a .json file, set via os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = path to the file. I also tried to declare the project ID explicitly, as follows:
self.project_id = 'xxxxx'
query = bq.Query(sql=myQuery, context = self.project_id)
This time I ended up with the following error:
TypeError: __init__() got an unexpected keyword argument 'context'
I'm also using an up-to-date version of the library. Thanks for your help.
Re: The project ID is specified in the "FROM" clause, and I can also see the path to the .json file using the "echo" command. Below is the stack trace:
Traceback (most recent call last):
File "xxx/Queries.py", line 265, in <module>
brwdata._extract_gbq()
File "xxx/Queries.py", line 206, in _extract_gbq
, allow_large_results=True))
File "xxx/.local/lib/python3.5/site-packages/google/datalab/bigquery/_query.py", line 260, in execute_async
table_name = _utils.parse_table_name(table_name, api.project_id)
File "xxx/.local/lib/python3.5/site-packages/google/datalab/bigquery/_api.py", line 47, in project_id
return self._context.project_id
File "xxx/.local/lib/python3.5/site-packages/google/datalab/_context.py", line 62, in project_id
raise Exception('No project ID found. Perhaps you should set one by running'
Exception: No project ID found. Perhaps you should set one by running"%datalab project set -p <project-id>" in a code cell.
Upvotes: 0
Views: 622
Reputation: 5644
Here's an updated approach, in case someone needs it. In the latest version you can set the project ID through Context:
from google.datalab import bigquery as bq
from google.datalab import Context as ctx
ctx.project_id = 'PROJECT_ID'
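# query is your SQL string; execute() returns a job, so fetch its result before converting
df = bq.Query(query).execute().result().to_dataframe()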
...
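If you would rather not hard-code the project ID, one option is to read it from the key file itself. This is only a minimal sketch, assuming the file that GOOGLE_APPLICATION_CREDENTIALS points to is a standard service-account key (those carry a "project_id" field). The helper name project_id_from_credentials is made up for illustration and is not part of datalab:

```python
import json
import os


def project_id_from_credentials(env_var="GOOGLE_APPLICATION_CREDENTIALS"):
    """Return the "project_id" field of the key file named by env_var.

    Hypothetical helper: returns None if the variable is unset, the file
    is missing, or the file has no "project_id" field.
    """
    path = os.environ.get(env_var)
    if not path or not os.path.isfile(path):
        return None
    with open(path) as fh:
        key = json.load(fh)
    return key.get("project_id")
```

The returned value could then be assigned in place of the 'PROJECT_ID' placeholder above.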
Upvotes: 1
Reputation: 879
So, if you run "echo $GOOGLE_APPLICATION_CREDENTIALS" you can see the path to your JSON file, which means the environment variable is set. Could you check that the "FROM" clause of the query names the right external project? Also, if your QueryOutput destination is in your very same project, you are doing it right:
table('dataset.table'.....)
Otherwise, you should specify the project as well:
table('project.dataset.table'....)
I don't know exactly how you are running the query, but the error might be there.
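To avoid depending on a default project at all, you could always pass the three-part name. A rough sketch of that idea, where qualify_table_name is a hypothetical helper of mine (not a datalab function) and I'm assuming the project ID itself contains no dots:

```python
def qualify_table_name(name, default_project):
    """Return name as 'project.dataset.table'.

    Hypothetical helper. Assumes the project ID contains no dots, so a
    two-part name is 'dataset.table' and a three-part name is already
    fully qualified.
    """
    parts = name.split(".")
    if len(parts) == 3:
        return name
    if len(parts) == 2:
        return "{}.{}".format(default_project, name)
    raise ValueError("expected 'dataset.table' or 'project.dataset.table', got %r" % name)
```

For example, qualify_table_name('datasetxxx._tmp_table', 'xxxxx') gives 'xxxxx.datasetxxx._tmp_table', which can be passed to bq.QueryOutput.table(...).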
I reproduced your setup with the following code, and it worked fine for me:
import google.datalab
from google.datalab import bigquery as bq
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "./bqauth.json"
myQuery = "SELECT * FROM `MY_EXAMPLE_PROJECT.MY_EXAMPLE_DATASET.MY_EXAMPLE_TABLE` LIMIT 1000"
query = bq.Query(sql=myQuery)
job = query.execute_async(
    output_options=bq.QueryOutput.table('MY_EXAMPLE_PROJECT.MY_EXAMPLE_DATASET2.MY_EXAMPLE_TABLE2', mode='overwrite', allow_large_results=True))
job.result()
Upvotes: 1