FreddieBL

Reputation: 25

How to create a Glue search script in Python

I have been asked to write a Python script that pulls out all the Glue databases in our AWS account and then lists all the tables and partitions in each database in a CSV file. It's acceptable for it to just run on my desktop for now. I would really appreciate some guidance or direction on how to go about this, as I'm a new junior and would like to explore my options before going back to my manager.

format: layout of csv file

Upvotes: 1

Views: 254

Answers (1)

crazyPen

Reputation: 873

This can be done easily using Boto3 - https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/glue.html#Glue.Client

I'll start it off for you and you can figure out the rest.

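import boto3

# Uses the default credentials and region configured on your machine.
glue_client = boto3.client('glue')
# Names of the databases in the first page of results (not paginated).
db_name_list = [db['Name'] for db in glue_client.get_databases()['DatabaseList']]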

I haven't tested this code, but it should create a list of the names of all your databases. From here you can run nested loops: call get_tables(DatabaseName=...) for each database to get its tables, and then get_partitions(DatabaseName=..., TableName=...) for each table to get its partitions.

Make sure to read the documentation to double-check that the arguments you're providing are correct.

EDIT: You will also likely need to use a paginator if a large number of values is returned. Best practice would be to use a paginator for all three calls, which would just mean an additional loop at each step. The paginator documentation is here - https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/glue.html#Glue.Paginator.GetDatabases

There are also plenty of Stack Overflow examples on how to use it.
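To make the nested loops and paginators concrete, here is a rough, untested-against-AWS sketch of how the three calls could fit together. The function names (iter_glue_rows, write_glue_csv) and the CSV columns (database, table, partition_values) are my own assumptions, since the question doesn't spell out the layout, so adjust them to whatever format you need. Tables without partitions get one row with an empty partition column; databases with no tables are skipped.

```python
import csv


def iter_glue_rows(glue_client):
    """Yield (database, table, partition_values) tuples from the Glue catalog.

    glue_client is expected to be a boto3 Glue client, e.g. boto3.client("glue").
    """
    db_pages = glue_client.get_paginator("get_databases").paginate()
    for db_page in db_pages:
        for db in db_page["DatabaseList"]:
            db_name = db["Name"]
            table_pages = glue_client.get_paginator("get_tables").paginate(
                DatabaseName=db_name
            )
            for table_page in table_pages:
                for table in table_page["TableList"]:
                    table_name = table["Name"]
                    found_partition = False
                    part_pages = glue_client.get_paginator("get_partitions").paginate(
                        DatabaseName=db_name, TableName=table_name
                    )
                    for part_page in part_pages:
                        for partition in part_page["Partitions"]:
                            found_partition = True
                            # "Values" holds one string per partition key;
                            # joining with "/" is an arbitrary choice.
                            yield db_name, table_name, "/".join(partition["Values"])
                    if not found_partition:
                        # Unpartitioned table: still list it once.
                        yield db_name, table_name, ""


def write_glue_csv(glue_client, csv_path):
    """Write one CSV row per partition (assumed column layout)."""
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["database", "table", "partition_values"])
        writer.writerows(iter_glue_rows(glue_client))
```

To run it from your desktop you would do something like write_glue_csv(boto3.client("glue"), "glue_catalog.csv") after importing boto3, assuming your AWS credentials and region are already configured locally. Passing the client in as an argument also makes it easy to try the loops against a stub before pointing it at the real account.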

Upvotes: 1
