Google BigQuery: SELECT only from newest tables

Question

I have Big Query tables like below, and like to issue a query to the newest tables for each group (marked <=)

prefix_AAAAAAA_20170329012345
             :
prefix_AAAAAAA_20170413032333
prefix_AAAAAAA_20170413112205 <=
prefix_BBBBBBB_20170327233456
             :
prefix_BBBBBBB_20170414012244
prefix_BBBBBBB_20170414130509 <=
prefix_CCCCCCC_20170327154412
             :
prefix_CCCCCCC_20170414090022
prefix_CCCCCCC_20170414091530 <=

like my previous question, I tried to use TABLE_QUERY. But I could not find the way.

How can I do this? Or Can I do this with wildcard table in Standard SQL?

Mikhail Berlyant · Accepted Answer

Try below

#standardSQL
SELECT *
FROM `yourProject.yourDataset.prefix_*`
WHERE _TABLE_SUFFIX IN (
  SELECT SUBSTR(CONCAT(t, d), LENGTH('prefix_') + 1) AS tbl
  FROM (
    SELECT SUBSTR(table_id, 1, LENGTH(table_id) - 14) AS t, MAX(SUBSTR(table_id, -14)) AS d
    FROM `yourProject.yourDataset.__TABLES__`
    WHERE SUBSTR(table_id, 1, LENGTH('prefix_')) = 'prefix_' AND LENGTH(table_id) > 14
    GROUP BY t
  )
)

Note: number 14 in above query comes from length of datetime related suffix in your tables

prefix_AAAAAAA_20170329012345

Google BigQuery: SELECT only from newest tables

Answers (1)

Related Questions