Reputation: 1133
I am trying to create a job script using Java. In the AWS Glue console, I can only find "Python, Spark" as options. Does that mean we can't write the script in Java at all? If so, what is this API used for: aws-java-sdk-glue?
I even found an example: https://stackoverflow.com/questions/48256281/how-to-read-aws-glue-data-catalog-table-schemas-programmatically
From the above, it seems we can write an AWS Glue script in Java too. Can anyone please confirm this?
EDIT:
In Scala, we write: glueContext.getCatalogSource(database = "my_data_base", tableName = "my_table")
In Java, I found the class below, which has methods named withDatabaseName and withTableName:
https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/glue/model/CatalogEntry.html
So what is the purpose of the above class?
Upvotes: 3
Views: 3702
Reputation: 5124
The language option you see on the Glue console is the language of the script that you write to extract, transform, and load the actual data. The source can be a database or an S3 bucket, and the destination can be anything, depending on your use case.
Normally you can create a Glue job or an S3 bucket from the AWS Management Console. When you don't want to do this manually, you need an SDK, which provides the API calls used to create AWS resources.
So the script inside a Glue job can only be written in Python or Scala, but the Glue job itself can be created using different languages/SDKs:
Java - https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/glue/AWSGlueClient.html
Python - https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/glue.html
JavaScript - https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/Glue.html
Ruby - https://docs.aws.amazon.com/sdk-for-ruby/v3/api/Aws/Glue/Client.html
All of the above are SDKs used to define resources in AWS, whereas the link below has the actual code used inside a Glue job.
https://github.com/aws-samples/aws-glue-samples
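To make the distinction concrete, here is a minimal sketch of Java code that creates a Glue job whose script is written in Scala. It assumes the aws-java-sdk-glue (SDK v1) dependency is on the classpath and that default AWS credentials and a region are configured. The class name CreateGlueJobSketch and the four command-line arguments are illustrative, not part of the SDK.

```java
import com.amazonaws.services.glue.AWSGlue;
import com.amazonaws.services.glue.AWSGlueClientBuilder;
import com.amazonaws.services.glue.model.CreateJobRequest;
import com.amazonaws.services.glue.model.CreateJobResult;
import com.amazonaws.services.glue.model.JobCommand;

import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class: the Java code only defines the job,
// while the ETL script it points to is a Scala file stored in S3.
class CreateGlueJobSketch {

    // Builds the job definition locally; no AWS call happens here.
    public static CreateJobRequest buildRequest(String jobName, String roleArn,
                                                String scriptLocation, String scalaClass) {
        Map<String, String> defaultArgs = new HashMap<>();
        defaultArgs.put("--job-language", "scala"); // Glue special parameter: script language
        defaultArgs.put("--class", scalaClass);     // Glue special parameter: Scala entry-point class

        return new CreateJobRequest()
                .withName(jobName)
                .withRole(roleArn)
                .withCommand(new JobCommand()
                        .withName("glueetl")                 // Spark ETL job type
                        .withScriptLocation(scriptLocation)) // S3 path of the Scala script
                .withDefaultArguments(defaultArgs);
    }

    public static void main(String[] args) {
        if (args.length < 4) {
            System.err.println("usage: <jobName> <roleArn> <s3ScriptLocation> <scalaClass>");
            return;
        }
        // Uses the default credential and region provider chains.
        AWSGlue glue = AWSGlueClientBuilder.defaultClient();
        CreateJobResult result = glue.createJob(
                buildRequest(args[0], args[1], args[2], args[3]));
        System.out.println("Created Glue job: " + result.getName());
    }
}
```

Nothing in this program transforms data. It only registers a job definition, and Glue later runs the Scala script stored at the given S3 location.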
Upvotes: 2
Reputation: 8097
Java is not supported for the actual script definition of AWS Glue jobs.
The API you are referring to is the AWS SDK, which lets you create and manage AWS Glue resources: creating and running crawlers, viewing and managing the Glue Data Catalog, creating job definitions, etc.
So you can manage resources in the Glue service with the AWS SDK for Java, similar to how you manage resources in EC2, S3, or RDS with the AWS SDK for Java.
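As an illustration of this management role, here is a minimal sketch that reads a table's column schema from the Glue Data Catalog with the SDK. It assumes aws-java-sdk-glue (SDK v1) on the classpath and default credentials. The class name GlueCatalogSchemaSketch and the describeColumns helper are made up for this example.

```java
import com.amazonaws.services.glue.AWSGlue;
import com.amazonaws.services.glue.AWSGlueClientBuilder;
import com.amazonaws.services.glue.model.Column;
import com.amazonaws.services.glue.model.GetTableRequest;
import com.amazonaws.services.glue.model.Table;

import java.util.ArrayList;
import java.util.List;

// Hypothetical example class: reads catalog metadata only, never the table's data.
class GlueCatalogSchemaSketch {

    // Formats each catalog column as "name: type".
    public static List<String> describeColumns(List<Column> columns) {
        List<String> lines = new ArrayList<>();
        for (Column column : columns) {
            lines.add(column.getName() + ": " + column.getType());
        }
        return lines;
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.err.println("usage: <databaseName> <tableName>");
            return;
        }
        // Uses the default credential and region provider chains.
        AWSGlue glue = AWSGlueClientBuilder.defaultClient();
        Table table = glue.getTable(new GetTableRequest()
                        .withDatabaseName(args[0])
                        .withName(args[1]))
                .getTable();
        // The storage descriptor holds the table's column definitions.
        for (String line : describeColumns(table.getStorageDescriptor().getColumns())) {
            System.out.println(line);
        }
    }
}
```

This is a metadata lookup that runs wherever the Java program runs. By contrast, glueContext.getCatalogSource in a Scala job script loads the table's data inside the Glue job itself.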
Upvotes: 1