Reputation: 243
I would like to avoid using an AWS Glue dev endpoint. Is there a way to test and debug my PySpark code in a local notebook or IDE instead?
Upvotes: 2
Views: 4054
Reputation: 133
I was able to test without dev endpoints.
Please follow the instructions here: https://support.wharton.upenn.edu/help/glue-debugging
Upvotes: 0
Reputation: 415
As others have said, it depends on which parts of Glue you are going to use. If your code is pure Spark, without DynamicFrames and the like, then a local installation of Spark may suffice. If, however, you intend to use the Glue extensions, there is not really an alternative to the dev endpoint at this stage.
I hope that this helps.
Upvotes: 1
Reputation: 3173
If you are going to deploy your PySpark code on the AWS Glue service, you may have to use GlueContext and other AWS Glue APIs. If you want to test code that uses those APIs against the AWS Glue service, you need an AWS Glue dev endpoint.
However, an AWS Glue notebook is optional: you can set up Zeppelin (or similar) locally and establish an SSH tunnel to the dev endpoint to develop and test from your local environment. Make sure you delete the dev endpoint once your development or testing is done for the day.
Alternatively, if you are not keen on using AWS Glue APIs other than GlueContext, then yes, you can set up Zeppelin in your local environment, test the code locally, then upload it to S3 and create a Glue job to test it in the AWS Glue service.
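One way to keep the same script usable both locally and in a Glue job is to touch GlueContext only when the awsglue package is importable. The following is a minimal sketch under that assumption; the helper names glue_available and get_spark_session are made up for illustration and are not part of any AWS API. It assumes pyspark is installed locally.

```python
import importlib.util


def glue_available():
    """Return True when the awsglue package can be found (e.g. inside a Glue job)."""
    return importlib.util.find_spec("awsglue") is not None


def get_spark_session():
    """Return a SparkSession, obtained through GlueContext when Glue is present.

    Imports are done lazily so this module can be imported without pyspark
    or awsglue being installed.
    """
    from pyspark.context import SparkContext

    sc = SparkContext.getOrCreate()
    if glue_available():
        # Only reached where awsglue exists, such as a Glue job.
        from awsglue.context import GlueContext

        return GlueContext(sc).spark_session

    # Local fallback: a plain SparkSession on top of the same SparkContext.
    from pyspark.sql import SparkSession

    return SparkSession.builder.getOrCreate()
```

Code that only needs DataFrames can then call get_spark_session() and run unchanged in both places; anything that relies on DynamicFrames still needs a real Glue environment.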
Upvotes: 0
Reputation: 2144
We use pytest to test our PySpark code. We keep the PySpark logic in a separate file and call those functions from the Glue job script. With this separation, we can unit test the PySpark code using pytest.
Upvotes: 0
Reputation: 1539
We have a setup where PySpark is installed locally and we use VS Code to develop, unit test, and debug our PySpark code. During development we run the code against the local PySpark installation, then we deploy it to EMR to run against the real dataset.
I'm not sure how much of this applies to what you're trying to do with Glue, as it sits at a higher level of abstraction.
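For a local installation like the one described above, a session that runs entirely on the developer machine can be created as in this sketch. The helper names and the specific settings (4 shuffle partitions, UI disabled) are illustrative assumptions, not a recommended configuration; pyspark is assumed to be installed, for example via pip.

```python
def local_spark_conf(shuffle_partitions=4):
    """Spark settings for running on a single machine with small test data."""
    return {
        "spark.master": "local[*]",  # use all local cores, no cluster
        "spark.sql.shuffle.partitions": str(shuffle_partitions),
        "spark.ui.enabled": "false",  # skip the web UI during tests
    }


def build_local_spark(app_name="local-dev"):
    """Create (or reuse) a local SparkSession configured from local_spark_conf()."""
    from pyspark.sql import SparkSession  # lazy import; needs pyspark installed

    builder = SparkSession.builder.appName(app_name)
    for key, value in local_spark_conf().items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Calling build_local_spark() from a script or a debugger session in the IDE gives a session you can set breakpoints around; call spark.stop() when finished.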
Upvotes: 0