Reputation: 531
I have a google cloud function triggered by a topic pubsub.
Most of the times, everything works fine like this:
2021-11-06 15:12:38.830 GMT vote_stage oga5hymuzx4p Function execution started
2021-11-06 15:13:45.392 GMT vote_stage oga5hymuzx4p Function execution took 66563 ms, finished with status: 'ok'
But unfortunately, once in twenty, i can only see this:
2021-11-06 15:08:45.139 GMT vote_stage rvfnd4wo5nwd Function execution started
How is this possible that there is no exit log ?
Below, the code of the function:
def vote_stage(event, context):
assert context == context
call_datetime = reusable.time.get_now()
game_id = event['attributes']['game_id']
logger.info(f'start, game_id={game_id}')
game = build_game(game_id)
resp = ut.exceptions.ExceptionsHandler(game).handle_vote_stage_exceptions()
if resp:
logger.info(f'exception, game_id={game_id}')
return resp
game.dict['vote_stage_last_trigger'] = reusable.time.get_now()
ut.firestore.FirestoreEditor(game).set_game(merge=True)
while True:
game = build_game(game_id)
ut.slack.SlackOperator(game).update_vote_stage_lower()
c1 = len(game.frozen_guessers) == 1
c2 = game.time_left_to_vote <= 0
c3 = not game.remaining_potential_voters
if c1 or c2 or c3:
game.dict['frozen_voters'] = deepcopy(game.dict['voters'])
game.dict['vote_stage_over'] = True
ut.firestore.FirestoreEditor(game).set_game(merge=True)
game.stage_triggerer.trigger_pre_result_stage()
logger.info(f'pre_result_stage triggered, game_id={game_id}')
return make_response('', 200)
if ut.time.datetime1_minus_datetime2(
reusable.time.get_now(),
call_datetime) > game.self_trigger_threshold:
game.stage_triggerer.trigger_vote_stage()
logger.info(f'vote_stage self-triggered, game_id={game_id}')
return make_response('', 200)
time.sleep(game.refresh_interval)
Here, game.self_trigger_threshold = 60s and game.refresh_interval = 9s. The vote_stage function has a timeout of 540s.
The method game.stage_triggerer.trigger_pre_result_stage() publishes a message to a topic "topic_pre_result_stage" which triggers the "pre_result_stage" cloud function.
The method game.stage_triggerer.trigger_vote_stage() publishes a message to a topic "topic_vote_stage" which triggers the "vote_stage" cloud function.
The only inner log i find is:
2021-11-06 15:08:49.594 GMT vote_stage rvfnd4wo5nwd 2021-11-06 15:08:49,594 - INFO - start, game_id=20211106_150714&T01LTPERV4Z&C02EKL5PQAX&U01LFAZNKPV&2694551026434.1707796879169.ce017050e1e74e97332c2a04d4b19c08
Here is a screenshot (took at 2021-11-07) of the Legacy Logs Viewer showing all the logs of this function execution (with execution_id = rvfnd4wo5nwd):
Here is a screenshot (took at 2021-11-07) of the Legacy Logs Viewer showing all the logs with level = "Errors" of the vote_stage function:
As one can see, none is from 2021-11-06.
Upvotes: 2
Views: 1115
Reputation: 531
I do not have the bugs anymore. I did the following change: i replaced publisher.publish(**kwargs)
by
publisher.publish(**kwargs).result()
This function is used to trigger a pubsub topic.
As described here https://cloud.google.com/functions/docs/troubleshooting : If your function creates asynchronous tasks, it must also explicitly wait for these tasks to complete.
Upvotes: 2
Reputation: 2045
The missing logs entries from Cloud Functions is not something usually expected and could be caused by termination of the process before it reach the phase where the logs are forwarded to Cloud Monitoring. The missing log entry is not typically a concern but could indicate some unbalanced configuration. The premature resource termination can be caused by exhausting some limit. It looks like your function takes significant amount of time.
vote_stage oga5hymuzx4p Function execution took 66563 ms,
finished with status: 'ok'
Where default timeout is 1 minute and it can be extended up to 9 as your function says that it has a timeout of 540s.
In your setup, the function has the timeout close to the limit and it happens that it consumes most of execution time. So it could be possible that logs would not be captured. What could help is to experiment with Cloud Functions configuration and provide more resources but that could not be sufficient as the root cause is not known here.
The best option I could suggest is to reach out to Google Cloud Support to properly check for what is going on in the background and to properly understand the traffic characteristics of the function. GCP has a free trial that you can try and support for it is available as well.
Upvotes: 0