Reputation: 2788
I think that what I want to do is not feasible at the moment, but I want to clarify.
I have a bucket, say bucketA, with files served to the public, and a bucket, say bucketB, where the access logs of bucketA are stored in a specific CSV format.
What I want to do is run SQL queries against these access logs. The problem I have is that the logs are stored in different CSVs (one per hour, I think). I tried to import them through the BigQuery web UI, but it seems there is a one-to-one CSV-to-table mapping. When you define the input location, the placeholder and the documentation ask you to put a gs://<bucket_name>/<path_to_input_file>.
Based on the above, my question is: Is it possible to upload all the files in a bucket to a single BigQuery table, with something like an "*" asterisk operator?
Once the table is constructed, what happens when more files with data get stored in the bucket? Do I need to re-run the import, or is there a scheduler?
Upvotes: 1
Views: 3319
Reputation: 803
Using the BigQuery web UI, I have created a new table plus some initial data with the standard CSV upload method.
For quick testing, how do I use the BigQuery web UI to insert more new data into the existing table?
I realized I CANNOT copy and paste multiple INSERT statements into the Query editor textbox.
INSERT INTO dataset.myschema VALUES ('new value1', 'more value1');
INSERT INTO dataset.myschema VALUES ('new value2', 'more value2');
Wow, then it would be tedious to insert new rows of data one by one.
Luckily, BigQuery supports INSERT statements that use the VALUES syntax to insert multiple rows:
INSERT INTO dataset.myschema VALUES ('new value1', 'more value1'),
('new value2', 'more value2');
Upvotes: 0
Reputation: 14781
Based on the above, my question is: Is it possible to upload all the files in a bucket to a single BigQuery table, with something like an "*" asterisk operator?
You can query them directly in GCS (as a federated source) or load them all into a native table, using * in both cases.
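For example, here is a rough sketch of both options using the google-cloud-bigquery Python client (the project, dataset, and table names and the bucket path are placeholders; the same gs://bucketB/*.csv wildcard also works in the web UI's source field and in bq load):

from google.cloud import bigquery

client = bigquery.Client()

# Option 1: federated (external) table that reads the CSVs in place
external_config = bigquery.ExternalConfig("CSV")
external_config.source_uris = ["gs://bucketB/*.csv"]  # wildcard picks up every hourly log file
external_config.autodetect = True  # or define the access-log schema explicitly
table = bigquery.Table("my_project.my_dataset.access_logs_ext")
table.external_data_configuration = external_config
client.create_table(table)

# Option 2: load all matching files into a native table in one job
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    autodetect=True,
)
load_job = client.load_table_from_uri(
    "gs://bucketB/*.csv",
    "my_project.my_dataset.access_logs",
    job_config=job_config,
)
load_job.result()  # wait for the load job to finish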
Once the table is constructed, what happens when more files with data get stored in the bucket? Do I need to re-run the import, or is there a scheduler?
If you leave it as an external table, then each time you query it BigQuery will scan all the files, so you'll always see new files/data. If you load it into a native table, then you'll need to schedule a job yourself to append each new file to your table.
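As a rough sketch of that second case (file path and table names are placeholders), a job you run on whatever schedule suits you (cron, Cloud Scheduler, etc.) could append each newly arrived log file to the existing native table:

from google.cloud import bigquery

client = bigquery.Client()

# Append only the newly arrived log file to the existing native table
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
)
client.load_table_from_uri(
    "gs://bucketB/new_hourly_log.csv",  # placeholder for the latest log file
    "my_project.my_dataset.access_logs",
    job_config=job_config,
).result()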
Upvotes: 2