Reputation: 11
I want to know how to reduce the amount of BigQuery data processed by changing settings in Data Portal (Google's BI tool, formerly called Data Studio).
The reason is that unless I reduce the amount of BigQuery data scanned, I can't run my SQL without incurring a large cost.
I'm looking for a way that uses Data Portal settings, not changes on the BigQuery side (including rewriting the SQL code).
This is because the dashboard in Data Portal keeps consuming BigQuery data every time it runs, so changing the SQL alone doesn't solve my problem.
My situation is as follows:
1. I made a view in my BigQuery environment. I tried to write its query so that it would not process a lot of data; for example, I avoided "SELECT * FROM ..." (a sketch of such a view is shown after this list).
2. I set the view as the data source in Data Portal and built the dashboard on that data source.
3. Whenever someone opens the dashboard, the view is executed, so BigQuery data is processed every time the dashboard is opened.
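For illustration, here is a minimal sketch of that kind of column-pruned view; the project, dataset, table, and column names are all hypothetical:

    -- Hypothetical names throughout; the point is to select only the
    -- columns the dashboard needs instead of using SELECT *.
    CREATE OR REPLACE VIEW `my-project.my_dataset.sales_view` AS
    SELECT
      order_date,
      region,
      revenue
    FROM
      `my-project.my_dataset.sales`;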
Upvotes: 0
Views: 192
Reputation: 1387
If I'm understanding correctly, you want to reduce the amount of data processed in BigQuery by your Data Studio (in Japan, Data Portal) reports.
There are a few ways to do this:
Make sure that the "Enable Cache" option is checked in the report settings.
Avoid using BigQuery views directly as a query source, since these aren't cached at the BigQuery level: the view's query runs every time, and likely many times per report for the various charts. Instead, use a Custom Query connection or pull the table data directly to allow caching. Another option (which we use heavily) is a scheduled query that saves the output of a view as a table and replaces it regularly (or is triggered when the underlying data is refreshed); a sketch of this pattern appears after this list. This way your queries can be cached, but the business logic still lives in the view.
Create a BI Engine reservation in BigQuery. This adds another level of caching to Data Studio reports, and may give you better results for things that can't be query-cached or cached in Data Studio. (While there will be a cost to the service in the future based on the size of instance you reserve, it's free during their beta period.)
Don't base your queries on tables that have a streaming buffer attached (even if it hasn't received rows recently), that use wildcard tables in the query, or that are backed by an external data source (e.g. a file in Cloud Storage or Bigtable). See Caching Exceptions for details.
Pull as little data as possible by using the new Data Source Parameters. These let you pass the values of your date range or other filters directly to BigQuery and filter the data before it reaches your report. This is especially helpful if you have a date-partitioned table, since you can scan only the needed partitions, which greatly reduces processing and the amount of data returned (a sketch appears after this list).
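To make the scheduled-query suggestion concrete, here is a minimal sketch of materializing a view into a plain table so that downstream report queries can be cached; the view and table names are hypothetical:

    -- Hypothetical names; run this on a schedule (or when the source
    -- data refreshes) so report queries hit a plain, cacheable table.
    CREATE OR REPLACE TABLE `my-project.my_dataset.sales_materialized` AS
    SELECT *
    FROM `my-project.my_dataset.sales_view`;

And a sketch of a Custom Query that uses the report's date-range parameters; @DS_START_DATE and @DS_END_DATE are the parameters Data Studio passes (as YYYYMMDD strings), while the table and column names are hypothetical and the table is assumed to be partitioned on order_date:

    -- Only the partitions inside the report's date range are scanned.
    SELECT
      order_date,
      region,
      revenue
    FROM `my-project.my_dataset.sales_materialized`
    WHERE order_date BETWEEN PARSE_DATE('%Y%m%d', @DS_START_DATE)
                         AND PARSE_DATE('%Y%m%d', @DS_END_DATE);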
Also, sometimes it seems like you're moving a lot of data, but that doesn't always translate into a high cost. Check your cost breakdowns, or look at the logging filtered to the user your data source authenticates as, and see how much cost has actually been incurred. Certain operations fall under a free tier, and others don't result in cost for non-egress use cases like Data Studio. All that to say: you may want to make sure there's a cost problem at the BigQuery level in the first place before killing yourself trying to optimize the usage.
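As one way of checking, here is a hedged sketch that sums billed bytes per user over the last 30 days using BigQuery's INFORMATION_SCHEMA jobs view; it assumes your data lives in the US region (adjust the region qualifier otherwise):

    -- Aggregate billed bytes by the user each query job ran as.
    SELECT
      user_email,
      SUM(total_bytes_billed) / POW(1024, 4) AS tib_billed
    FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
    WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
      AND job_type = 'QUERY'
    GROUP BY user_email
    ORDER BY tib_billed DESC;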
Upvotes: 1