Reputation: 15
Good morning,
I'm developing a Cloud Function that, whenever someone deposits a file in a bucket, moves it to another bucket, performs an insert in BigQuery (pulling data from another table using information parsed from the file name), and deletes the file from the original bucket.
The problem is that usually about 100 files are deposited at once, which makes the function fail: too many inserts run in BigQuery at the same time, so the insert for some files is never performed and those files are never deleted from the original bucket.
The function does something like this:
const { BigQuery } = require('@google-cloud/bigquery');
const bigquery = new BigQuery();

// Triggered by the 'finalize' event on the source bucket.
async function archive(data, context) {
  const file = data;
  console.log(`Detected new or overwritten file 'gs://${file.bucket}/${file.name}'`);

  // Parse metadata out of the file name, e.g. 0023_ROMERO_2021_02.xml
  const filepath = file.name.split('/');
  const filename = filepath.slice(-1).join('/');
  const filenameWithoutExtension = filename.split('.').slice(0, -1).join('.');
  const filenameData = filenameWithoutExtension.split('_');
  const fileData = {
    id_file: filenameData[0],
    name: filenameData[1],
    year: filenameData[2],
    month: filenameData[3]
  };

  // copyFile and deleteFile are helper functions defined elsewhere.
  await copyFile(srcBucketName, srcFilename, destBucketName, ...);

  const [job] = await bigquery.createQueryJob({
    query: `INSERT INTO table1 (....)
      SELECT
        ...
        timestamp('${file.timeCreated}')
      FROM
        table2
      WHERE
        month = '${fileData.month}'
        AND year = '${fileData.year}'
        AND id_file = '${fileData.id_file}'
    `,
    location: 'EU',
  });
  console.log(`BigQuery job ${job.id} started.`);

  // Wait for the insert to finish before deleting the source file.
  await job.getQueryResults();
  console.log(`BigQuery job ${job.id} completed.`);

  await deleteFile(file.bucket, file.name);
  console.log('Completed');
}
Is there a way to solve this problem?
Thanks in advance
Upvotes: 0
Views: 148
Reputation: 699
BigQuery has quotas and limits that you cannot raise beyond their hard caps, but you can adapt your function to work within them.
Solution prototype:
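One way to stay within those limits while keeping the per-file Cloud Function is to retry the BigQuery job with exponential backoff when it fails, so a burst of ~100 files does not hit the table all at the same moment. Below is a minimal sketch of that idea in Node.js; the helper name runInsertWithRetry and the backoff values are illustrative, not part of any official API.

// Sketch: retry the INSERT ... SELECT job with exponential backoff and jitter
// so a burst of concurrent files does not exceed BigQuery's job/DML quotas.
const { BigQuery } = require('@google-cloud/bigquery');
const bigquery = new BigQuery();

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function runInsertWithRetry(query, maxAttempts = 5) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const [job] = await bigquery.createQueryJob({ query, location: 'EU' });
      await job.getQueryResults(); // waits for the job to finish
      return job;
    } catch (err) {
      if (attempt === maxAttempts) throw err;
      // Backoff: 1s, 2s, 4s, ... plus up to 1s of random jitter.
      const delay = Math.pow(2, attempt - 1) * 1000 + Math.random() * 1000;
      console.warn(`Insert attempt ${attempt} failed (${err.message}); retrying in ${Math.round(delay)} ms`);
      await sleep(delay);
    }
  }
}

With the retry in place, the deleteFile call should still run only after the job succeeds, so a file is never removed from the source bucket unless its row was inserted. Other options are to batch the rows through Pub/Sub or to write them with a load job instead of a DML statement.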
Upvotes: 1