Reputation: 13507
An API server is running on Kubernetes Engine (GKE). Users can upload relatively small sets of data (~100 MB, multiple .csv files with the same structure) from client applications to Cloud Storage (GCS). Once an upload is complete, I need to import all data from all new .csv files into a single existing BigQuery table, with some user-specific params (e.g. mark each row with a user id). Order doesn't matter.
The Google docs offer GUI-based and command-line solutions for this. However, I assume there is a way to trigger the import and track its progress from the GKE-based server itself. How do I do that?
Not sure if this is important: the GKE API server is written in Node.js.
Upvotes: 0
Views: 938
Reputation: 33705
Here is an example of loading a file from GCS into BigQuery, taken from the BigQuery documentation. You can configure the job as you need; there are a few references on that page and a link to the GitHub repo with additional functionality:
// Imports the Google Cloud client libraries
const BigQuery = require('@google-cloud/bigquery');
const Storage = require('@google-cloud/storage');

// The project ID to use, e.g. "your-project-id"
// const projectId = "your-project-id";

// The ID of the dataset of the table into which data should be imported, e.g. "my_dataset"
// const datasetId = "my_dataset";

// The ID of the table into which data should be imported, e.g. "my_table"
// const tableId = "my_table";

// The name of the Google Cloud Storage bucket where the file is located, e.g. "my-bucket"
// const bucketName = "my-bucket";

// The name of the file from which data should be imported, e.g. "file.csv"
// const filename = "file.csv";

// Instantiates clients
const bigquery = BigQuery({
  projectId: projectId
});

const storage = Storage({
  projectId: projectId
});

let job;

// Imports data from a Google Cloud Storage file into the table
bigquery
  .dataset(datasetId)
  .table(tableId)
  .import(storage.bucket(bucketName).file(filename))
  .then((results) => {
    job = results[0];
    console.log(`Job ${job.id} started.`);
    // Wait for the job to finish
    return job.promise();
  })
  .then(() => {
    // Get the job's status
    return job.getMetadata();
  })
  .then((metadata) => {
    // Check the job's status for errors
    const errors = metadata[0].status.errors;
    if (errors && errors.length > 0) {
      throw errors;
    }
  })
  .then(() => {
    console.log(`Job ${job.id} completed.`);
  })
  .catch((err) => {
    console.error('ERROR:', err);
  });
Alternatively, after uploading you can run a query over the newly uploaded CSV file(s) and append the result to the desired destination table.
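That query-based route is also one way to handle the user-specific params from the question (tagging each row with a user id). A sketch, assuming the files are first loaded into a hypothetical staging table and that all table names below (my_dataset, user_staging, my_table) are placeholders:

// Sketch: append rows from a hypothetical staging table to the
// destination table, tagging each row with the uploading user's id.
const {BigQuery} = require('@google-cloud/bigquery');
const bigquery = new BigQuery();

async function appendWithUserId(userId) {
  const [job] = await bigquery.createQueryJob({
    query: 'SELECT @userId AS user_id, * FROM `my_dataset.user_staging`',
    params: {userId},
    destination: bigquery.dataset('my_dataset').table('my_table'),
    writeDisposition: 'WRITE_APPEND',
  });
  await job.getQueryResults(); // wait for the query job to finish
  console.log(`Job ${job.id} completed.`);
}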
Upvotes: 1