Reputation: 354
Python version: 3.7.3
Resource: Azure function
Azure plan: Consumption
Goal: Improve the speed of an Azure Function
Hi everyone,
I have the following code to classify mails:
__init__.py
### Importing libraries above
def main(req: func.HttpRequest,
         context: func.Context) -> func.HttpResponse:
    try:
        req_body = req.get_json()
    except ValueError:
        # No valid JSON body: fall through to the 400 response at the bottom
        req_body = None
    if req_body:
        try:
            ### Some code in between to load the variables classes, selected models, .... Those variables are JSON objects
            # vectorizer_parameters is assumed to be loaded above with the other variables
            prediction = cfp.predict(req_body['text_original'],
                                     req_body['text_cleaned'],
                                     selected_models, thresholds,
                                     vectorizer_parameters)
            return func.HttpResponse(json.dumps(prediction).encode('utf-8'),
                                     status_code = 200,
                                     mimetype = 'application/json')
        except Exception as e:
            # 'fehler' = 'error'
            return func.HttpResponse(json.dumps({'status': 'fehler', 'comment': str(e), 'stack_trace': traceback.format_exc()}),
                                     status_code = 400,
                                     mimetype = 'application/json')
    else:
        # 'Die Eingabedaten wurden falsch angegeben' = 'The input data was specified incorrectly'
        return func.HttpResponse(json.dumps({'status': 'fehler', 'comment': 'Die Eingabedaten wurden falsch angegeben', 'stack_trace': ''}),
                                 status_code = 400,
                                 mimetype = 'application/json')
cfp.py
def classify_mail(model_typ, scenario_name, X, vectorizer_parameters, modelFolderPath):
    ### ... some code in between
    model = joblib.load(modelFolderPath)
    vec = TfidfVectorizer(**vectorizer_parameters)
    X_features = vec.fit_transform(X)
    result['prediction'] = model.predict(X_features)[0]
    return result

def predict(mail, mail_cleaned,
            selected_models, thresholds, vectorizer_parameters):
    model_folder_path = "model"
    prediction = {}
    results = []
    for m in selected_models.keys():
        for s in selected_models[m]['scenarios']:
            result = {}
            result['name'] = m + '_' + s
            X = [mail_cleaned]
            selected_vectorizer_parameters = vectorizer_parameters[s]
            result.update(classify_mail(m, s, X, selected_vectorizer_parameters, model_folder_path))
            results.append(result)
    ### some code after
    return prediction
The method predict calls the method classify_mail 10 times (through the two nested for loops). Each call takes about 20 seconds. I would like to know how to run those 10 calls to classify_mail in parallel so that I can reduce the execution time of my Azure Function. I am getting the following error because the function times out on the Consumption plan:
BadRequest. Http request failed: the server did not respond within the timeout limit. Please see logic app limits at https://aka.ms/logic-apps-limits-and-config#http-limits.
Update 1:
I found this resource: Async method with Python. However, it is not clear to me how to implement it in my specific use case.
Upvotes: 1
Views: 560
Reputation: 8234
I would like to know how to call the method classify in parallel for those 10 times and I can reduce the time of the execution of my azure function.
One workaround is to set a max degree of parallelism for the function, so that the tasks run in parallel and the total execution time is lower than when they run one after another.
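As a rough illustration of running the calls in parallel, here is a minimal sketch using Python's standard concurrent.futures module. It is not the asker's actual code: predict_parallel is a hypothetical name, and classify_mail is replaced by a stand-in that only sleeps, so the example runs on its own. It also assumes that the calls are independent of each other. Threads only help when the work releases the GIL (file I/O, parts of NumPy); for purely CPU-bound work, ProcessPoolExecutor may be the better fit, and any speedup is limited by the CPU cores available to the function instance.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def classify_mail(model_typ, scenario_name, X, vectorizer_parameters, model_folder_path):
    # Stand-in for the real classify_mail in cfp.py (hypothetical body):
    # it only simulates a slow call so that this sketch is runnable.
    time.sleep(0.1)
    return {'prediction': len(X[0])}


def predict_parallel(mail_cleaned, selected_models, vectorizer_parameters,
                     model_folder_path="model", max_workers=10):
    # One task per (model, scenario) pair, mirroring the two nested loops.
    tasks = [(m, s)
             for m in selected_models
             for s in selected_models[m]['scenarios']]

    def run(task):
        m, s = task
        result = {'name': m + '_' + s}
        result.update(classify_mail(m, s, [mail_cleaned],
                                    vectorizer_parameters[s],
                                    model_folder_path))
        return result

    # executor.map returns the results in the same order as the tasks.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(run, tasks))
```

In predict, the two for loops would then be replaced by a single call such as results = predict_parallel(mail_cleaned, selected_models, vectorizer_parameters). max_workers caps how many calls run at the same time.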
Upvotes: 1