Reputation: 13101
I am using AWS Glue Crawlers to read from S3 zip files (without header) and populate Glue Catalog.
Columns are named by default: col_0
, col_1
...
How to change those column names using e.g. python boto3 module and interact with AWS Glue catalog directly?
Is there example snippet for doing this?
Thanks.
Upvotes: 2
Views: 5049
Reputation: 817
You can try pulling the tables and updating the names. Here is an example of what I would do.
First we'll try and retrieve the table:
database_name = 'ENTER TABLE NAME'
table_name = 'ENTER TABLE NAME'
response = self.glue_client.get_table(DatabaseName=database_name,table_name=Name)
old_table = response['Table']
Next we'll update the table with the values we want changed. The new table we create can only have certain fields in order for the update_table to accept it. So we'll do the following.
field_names = [
"Name",
"Description",
"Owner",
"LastAccessTime",
"LastAnalyzedTime",
"Retention",
"StorageDescriptor",
"PartitionKeys",
"ViewOriginalText",
"ViewExpandedText",
"TableType",
"Parameters"
]
new_table = dict()
for key in field_names:
if key in old_table:
new_table[key] = old_table[key]
Now that we have the updated table, we can manipulate the column names. Here is an example of changing just 'col_0' to 'new_col'
for col in new_table['StorageDescriptor']['Columns']:
if col['Name'] == 'col_0':
col['Name'] = 'new_col'
response=self.glue_client.update_table(DatabaseName=database_name,TableInput=new_table)
Hopefully this helps!
Upvotes: 7