I've started to work with remote functions: https://cloud.google.com/bigquery/docs/reference/standard-sql/remote-functions
I've been able to set up a Cloud Function and call it from BigQuery, but no more than 60 instances of this Cloud Function are ever active at the same time, even though the maximum is set to 3000. This small instance count does not seem to be affected by changing max_batching_rows, nor by the number of rows the function is called on.
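To make the expected behaviour concrete, here is a back-of-envelope sketch of how batching and instance count should interact. All numbers are illustrative assumptions (6000 input rows is hypothetical); only max_batching_rows = 1 and the ~10 s per call come from the setup described below.

```python
rows = 6000              # hypothetical number of input rows
max_batching_rows = 1    # as configured on the remote function
seconds_per_call = 10    # processing time simulated in the Cloud Function

# With max_batching_rows = 1, every row becomes one HTTP call.
http_calls = -(-rows // max_batching_rows)  # ceiling division

# If the configured maximum of 3000 instances were actually used,
# the calls would be served in ~2 waves of ~10 s each:
waves_at_max = -(-http_calls // 3000)
ideal_seconds = waves_at_max * seconds_per_call

# With only ~60 instances observed, the same work takes ~100 waves:
waves_observed = -(-http_calls // 60)
observed_seconds = waves_observed * seconds_per_call

print(http_calls, ideal_seconds, observed_seconds)  # 6000 20 1000
```

This is the gap being reported: roughly 20 s of wall-clock time if scaling reached the configured maximum, versus on the order of 1000 s at the ~60 instances actually observed.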
Configuration of the cloud function:
Graph showing the small number of active instances:
Variations over time are due to successive tests with various loads.
Code of the cloud function:
A delay of 10 s has been added for each call; it matches the time my real processing will take.
import json
import time
import uuid


def add_fake_user(request):
    request_json = request.get_json(silent=True)
    replies = []
    # BigQuery sends the batched rows under the 'calls' key
    calls = request_json['calls']
    # one id per invocation, to trace how rows were batched
    call_id = str(uuid.uuid4())
    for call in calls:
        time.sleep(10)  # simulate ~10 s of processing per row
        userno = call[0]
        corp = call[1]
        replies.append({
            'username': f'user_{userno}',
            'email': f'user_{userno}@{corp}.com',
            'n_call': len(calls),
            'call_id': call_id
        })
    return json.dumps({
        # each reply is a STRING (JSON not currently supported)
        'replies': [json.dumps(reply) for reply in replies]
    })
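To sanity-check the request/response contract locally (outside BigQuery), the handler can be fed a fake Flask-style request. The FakeRequest class and the shortened delay below are test scaffolding, not part of the deployed function; the payload shape mirrors what BigQuery sends per the remote functions docs.

```python
import json
import time
import uuid


class FakeRequest:
    """Minimal stand-in for the Flask request object Cloud Functions passes in."""
    def __init__(self, payload):
        self._payload = payload

    def get_json(self, silent=True):
        return self._payload


def add_fake_user(request, delay=0.01):  # delay shortened for a quick local run
    request_json = request.get_json(silent=True)
    calls = request_json['calls']
    call_id = str(uuid.uuid4())
    replies = []
    for userno, corp in calls:
        time.sleep(delay)
        replies.append({
            'username': f'user_{userno}',
            'email': f'user_{userno}@{corp}.com',
            'n_call': len(calls),
            'call_id': call_id,
        })
    return json.dumps({'replies': [json.dumps(r) for r in replies]})


# BigQuery sends one JSON object per HTTP call, holding up to
# max_batching_rows rows under 'calls'.
payload = {'calls': [[1, 'acme'], [2, 'acme']]}
body = json.loads(add_fake_user(FakeRequest(payload)))
print(len(body['replies']))                     # 2: one reply per input row
print(json.loads(body['replies'][0])['email'])  # user_1@acme.com
```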
configuration of the remote function:
CREATE OR REPLACE FUNCTION `PROJECT_NAME`.trash.add_fake_user(user_id INT64, corp_id STRING)
RETURNS STRING
REMOTE WITH CONNECTION `PROJECT_NAME.eu.gcf-conn`
OPTIONS (endpoint = 'my_url', max_batching_rows = 1)
Query calling the remote function
SELECT
`PROJECT_NAME`.trash.add_fake_user(var1, var2) AS foo
FROM
base
I've created an issue on Google's issue tracker: https://issuetracker.google.com/issues/235252503