Reputation: 463
I have deployed BI Engine in one of my Google projects and I am measuring cost savings using the following query
with tbl
as
(
select creation_time, total_bytes_processed, total_bytes_billed,
5 * (total_bytes_processed / 1000000000000) as cost_projected,
5 * (total_bytes_billed / 1000000000000) as cost_actual
from `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT b
where 1=1
and job_type = "QUERY"
and creation_time >= '2022-05-10 11:30:00.000 UTC'
and creation_time <= '2022-05-10 19:00:00.000 UTC'
)
select sum(cost_projected) - sum(cost_actual) as savings
from tbl
where 1=1
;
However, I noticed that very often I have accelerated queries (bi_engine_statistics.bi_engine_mode = 'FULL') for which 'total_bytes_billed = total_bytes_processed'. I was expecting that for accelerated queries total_bytes_billed should be equal to zero which does not seem to be the case.
So the questions are:
Upvotes: 1
Views: 576
Reputation: 11
Regarding your second question: I noticed that too and discussed it with Google a while ago, please find an excerpt of their answer below:
during join, one side of the join was reading from the table and another side from shuffle. As BI Engine does not support reading from shuffle, those stages are excluded from full/partial reporting.
Currently, this is not correctly represented in the information schema, but it is billed correctly. This is the (public fcticket https://issuetracker.google.com/issues/274899698?pli=1
Upvotes: 1
Reputation: 12818
QUERY
cannot be used for BI Engine, so it's somewhat unfair to keep counting them in.sum(total_bytes_processed) / pow(1024, 4) AS TB_processed
bi_engine_mode='FULL'
queries have savings:SELECT
total_bytes_processed,
total_bytes_billed,
bi_engine_statistics
FROM `my_project_id.region-eu.INFORMATION_SCHEMA.JOBS`
WHERE 1=1
and bi_engine_statistics.bi_engine_mode = 'FULL'
and total_bytes_processed = total_bytes_billed
and total_bytes_processed > 0
Upvotes: 0