Hemant
Hemant

Reputation: 94

Adding up header information while querying detailed information

I have data as this way. This is sample Teradata logs where CPU and IO is captured at QueryID level. I have parsed the querytext corresponding to the QueryID to further identify the database and tables referred within it. While I have parsed the query and broken into detailed, I cannot divide the CPU and IO captured at detailed level. They are header level attribute of that query

I am now displaying the data on Datastudio. The moment I take up DatabaseReferred or TablesReferred fields on the dashboard to get me the distinct counts of tables referred in that query, the CPU and IO gets duplicated as it internally UNNESTs the data and when I sum it up, it blows up

Can you give me an idea on how can I sum CPU only once per query, while still counting the distinct DatabaseReferred and TablesReferred in that query

Input data is as follows

Row    Username   QueryId    CPU    IO    DatabaseReferred TablesReferred 
1)     ABC        1234       100    123   DB1              TB1
                                          DB2              TB2
                                          DB1              TB3
2)     ABC        8454       589    565   DB1              TB3
                                          DB2              TB6
3)     ABC        3564       145    243   DB3              TB4
                                          DB5              TB3
4)     PQR        6352       737    562   DB2              TB6
                                          DB1              TB7
                                          DB1              TB2
5)     PQR        2345       200    126   DB2              TB5
                                          DB1              TB1


The output am looking is like below.

Username  Count(DistinctQueryID)  Sum(CPU)  SUM(IO)  DistinctDatabaseReferred DistinctTablesReferred 
ABC          3                     834       931           4                         5
PQR          2                     937       688           2                         5

For quick reference, I am preparing the WITH clause for input data to be used in the solution

SELECT 'ABC' username, cast('1234' as int64) QueryID, cast('100' as int64) CPU, cast('123' as int64) IO, ['DB1','DB2','DB1'] DatabaseReferred, ['TB1','TB2','TB3'] TablesReferred 
  UNION ALL
  SELECT 'ABC' username, cast('8454' as int64) QueryID, cast('589' as int64) CPU, cast('565' as int64) IO, ['DB1','DB2'] DatabaseReferred, ['TB3','TB6'] TablesReferred 
  UNION ALL
  SELECT 'ABC' username, cast('3564' as int64) QueryID, cast('145' as int64) CPU, cast('243' as int64) IO, ['DB3','DB5'] DatabaseReferred, ['TB4','TB3'] TablesReferred 
  UNION ALL
  SELECT 'PQR' username, cast('6352' as int64) QueryID, cast('737' as int64) CPU, cast('562' as int64) IO, ['DB2','DB1','DB1'] DatabaseReferred, ['TB6','TB7','TB2'] TablesReferred 
  UNION ALL
  SELECT 'PQR' username, cast('2345' as int64) QueryID, cast('200' as int64) CPU, cast('126' as int64) IO, ['DB2','DB1'] DatabaseReferred, ['TB5','TB1'] TablesReferred 

Upvotes: 1

Views: 35

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173036

Below is for BigQuery Standard SQL

#standardSQL
SELECT 
  Username, 
  Count_of_Distinct_QueryId, 
  Sum_CPU, 
  Sum_IO,
  (SELECT COUNT(DISTINCT db) FROM t.dbs AS db) AS DistinctDatabaseReferred,
  (SELECT COUNT(DISTINCT tbl) FROM t.tbls AS tbl) AS DistinctTablesReferred
FROM (
  SELECT Username,
    COUNT(DISTINCT QueryId) AS Count_of_Distinct_QueryId,
    SUM(CPU) AS Sum_CPU,
    SUM(IO) AS Sum_IO,
    ARRAY_CONCAT_AGG(DatabaseReferred) dbs,
    ARRAY_CONCAT_AGG(TablesReferred) tbls
  FROM `project.dataset.table`
  GROUP BY Username
) t   

If to apply to sample data from your question - output is

Row Username    Count_of_Distinct_QueryId   Sum_CPU Sum_IO  DistinctDatabaseReferred    DistinctTablesReferred   
1   ABC         3                           834     931     4                           5    
2   PQR         2                           937     688     2                           5   

Upvotes: 1

Related Questions