Reputation: 1582
There is a lot of documentation on gcp about querying sharded/wildcard tables [1][2], but I can't seem to figure out how to create or insert data into such as table.
Here's a trivial and mostly nonsensical example:
-- STANDARD SQL
CREATE TABLE IF NOT EXISTS `mydataset.mytable_CA` AS
SELECT "CA" as COUNTRY_CODE
CREATE TABLE IF NOT EXISTS `mydataset.mytable_US` AS
SELECT "US" as COUNTRY_CODE
SELECT * FROM `mydataset.mytable_*` WHERE _TABLE_SUFFIX = "CA"
How do I use wildcards in the create statement?
-- STANDARD SQL
-- DOES NOT WORK
CREATE TABLE IF NOT EXISTS `mydataset.mytable_*` AS
SELECT "US" as COUNTRY_CODE WHERE _TABLE_SUFFIX = "US"
Similarly, how do I do inserts?
Is it possible to do something along the lines of the following:
-- STANDARD SQL
-- DOES NOT WORK
INSERT INTO `mydataset.mytable_*` WHERE _TABLE_SUFFIX = COUNTRY_CODE
SELECT * FROM UNNEST(["US", "US", "CA"]) as COUNTRY_CODE
I know it is recommended to use partitioning and clustering, but I am asking specifically about wildcards, so please indulge me :)
Is this possible using standard SQL? If not, how about legacy SQL?
Upvotes: 0
Views: 3165
Reputation: 4085
As you can see here, DML
has some limitations in BigQuery
and one of them is:
Queries that contain Data Manipulation Language (DML) statements cannot use a wildcard table as the target of the query. For example, a wildcard table may be used in the FROM clause of an UPDATE query, but a wildcard table cannot be used as the target of the UPDATE operation.
In other words, you can not use a wildcard to specify a table to be created, updated or to have new records inserted. Despite that, you can use it to specify the source of your date.
Let's take a look in some examples of how to use wildcards with DML and DDL.
INSERTING
If you're inserting data into a wildcard table, you need to insert your data in the final table. In other words, you need to specify the coplete path including the suffix.
INSERT INTO `mydataset.mytable_US` <your query or your values>
If you're using the wildcard table as the source of data for inserting data in a normal table, you can use wildcards in the WHERE
clause:
INSERT INTO `mydataset.mynormaltable`
SELECT <fields>
FROM `mydataset.mytable_*`
WHERE _TABLE_SUFFIX = "US"
CREATING
Again, you can not use wildcard in the target, but you can use it to specify a source of data:
CREATE TABLE IF NOT EXISTS `mydataset.normal_table`
AS SELECT <fields>
FROM `mydataset.mytable_*`
WHERE _TABLE_SUFFIX = "US"
If you want to create a new wildcard table with the same prefix, you can do:
CREATE TABLE IF NOT EXISTS `mydataset.mytable_GB`
AS SELECT <fields>
FROM `mydataset.table`
In this example you created a new suffix named GB. After this, you will be able to query your table with the GB suffix normaly.
Finally, I would like to attach here a reference to DML and DDL for BigQuery
.
I hope it helps
Upvotes: 4