Reputation: 736
I have a successfully running AWS Glue job that transforms data for predictions. I would like to stop processing and output a status message (which is working) when a specific condition is reached:
if specific_condition is None:
    s3.put_object(Body=json_str, Bucket=output_bucket, Key=json_path)
    return None
This produces "SyntaxError: 'return' outside function". I also tried:
if specific_condition is None:
    s3.put_object(Body=json_str, Bucket=output_bucket, Key=json_path)
    job.commit()
This is not running in AWS Lambda; it is a Glue job that gets started from Lambda (e.g., via start_job_run()).
Upvotes: 7
Views: 8295
Reputation: 3387
[This answer may not be applicable to the latest Glue job versions; please refer to Jeremy's answer.]
There's no return in Glue Spark jobs, and job.commit() only signals to Glue that the job's task was completed; the script keeps running after it. To end the job once your processing is complete, you'll have to exit the script explicitly, for example by calling sys.exit() after job.commit().
Please note that if sys.exit() is called before job.commit(), the Glue job will fail.
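The ordering described above (commit first, then exit) can be sketched outside Glue. This is only an illustration: FakeJob is a hypothetical stand-in for the Glue job object, stop_early is an invented helper name, and the S3 write is left as a comment. It also shows that sys.exit() works by raising SystemExit, so anything wrapping the script can see that exception.

```python
import sys


class FakeJob:
    """Hypothetical stand-in for the Glue job object, so this runs anywhere."""

    def __init__(self):
        self.committed = False

    def commit(self):
        self.committed = True


def stop_early(job, specific_condition):
    """Commit the job and exit when the stop condition is met."""
    if specific_condition is None:
        # In the real script, the status message would be written to S3 here.
        job.commit()   # commit first, per the note above
        sys.exit(0)    # raises SystemExit(0); nothing after this line runs
    return "continued"


job = FakeJob()
try:
    stop_early(job, None)
    exit_code = None
except SystemExit as exc:
    # sys.exit() is implemented by raising SystemExit, which a caller
    # wrapping the script can catch and report.
    exit_code = exc.code
```

Whether the surrounding runtime treats that SystemExit as a clean stop or as an error depends on the Glue version, which is consistent with the caveat at the top of this answer.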
Upvotes: 1
Reputation: 375
If you click on Jobs and then click the relevant job, you will see an x mark next to "Running" in the job status.
For reference please check https://forums.aws.amazon.com/thread.jspa?threadID=262217
Upvotes: -2
Reputation: 1900
Since @amsh's solution did not work for me, I continued to look for a solution and discovered that os._exit() terminates the process immediately at the C level and does not perform any of the normal tear-downs of the interpreter.
Thanks to @Glyph's answer! You can then proceed this way:
import os

if specific_condition is None:
    s3.put_object(Body=json_str, Bucket=output_bucket, Key=json_path)
    job.commit()
    os._exit(0)  # os._exit() requires an exit status; 0 signals success
Your job will succeed and will not terminate with a "SystemExit: 0" error.
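To see what "no normal tear-down" means in practice, here is a small sketch that runs a child Python process and exits it with os._exit. It does not use Glue at all; the child script and its messages are made up for illustration. Note that os._exit takes a required exit status argument, and that it does not flush buffered output, so the child flushes stdout explicitly before exiting.

```python
import subprocess
import sys
import textwrap

# Child script: registers an atexit handler and a finally block, then
# leaves via os._exit(0). Neither clean-up path should run.
child_script = textwrap.dedent(
    """
    import atexit
    import os
    import sys

    atexit.register(lambda: print("atexit handler ran"))
    try:
        print("before exit")
        sys.stdout.flush()   # os._exit() does not flush buffered output
        os._exit(0)          # the exit status argument is required
    finally:
        print("finally block ran")
    """
)

result = subprocess.run(
    [sys.executable, "-c", child_script],
    capture_output=True,
    text=True,
)

print("exit status:", result.returncode)
print("child output:", result.stdout.strip())
```

Because no SystemExit exception is raised, there is nothing for a wrapper to catch and report. The trade-off is that any clean-up you rely on (flushing, closing connections) has to happen before the call.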
Upvotes: 6