Reputation: 317
We have an ETL script that reads data from the catalogue and writes it to S3 as Parquet. We also call a crawler to create/update the tables in Athena. The crawler does create the table, but it appends a strange suffix to the table name.
All the files in the folder I'm crawling are Parquet files with the same schema. This only happens when we call the crawler from the ETL script.
The code we use to call the crawler:
import boto3
# Create a Glue client in the job's region and start the crawler by name
glue_client = boto3.client("glue", region_name=args.get("aws_region"))
glue_client.start_crawler(Name=args["crawler_name"])
Expected: table_name
Actual: table_name_31e198c8c61861f127ae06487eb14a3f
Upvotes: 2
Views: 4156
Reputation: 5124
This happens whenever the Glue crawler encounters a duplicate table name in the Glue Data Catalogue. The AWS Glue documentation describes this behaviour:
If duplicate table names are encountered, the crawler adds a hash string suffix to the name.
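To check whether this is what happened, one option is to list the tables in the target database and look for names that are the expected name plus a hash. The following is only a rough sketch: the helper names `find_suffixed_tables` and `list_table_names` are made up for illustration, and the assumption that the suffix is an underscore followed by 32 hex characters is based solely on the example in the question.

```python
import re


def find_suffixed_tables(table_names, expected_name):
    """Return names that look like expected_name plus a crawler hash suffix."""
    # Assumption: the suffix is "_" followed by 32 lowercase hex characters,
    # as in the example from the question.
    pattern = re.compile(re.escape(expected_name) + r"_[0-9a-f]{32}")
    return [name for name in table_names if pattern.fullmatch(name)]


def list_table_names(glue_client, database_name):
    """Collect all table names in a Glue database using the get_tables paginator."""
    names = []
    paginator = glue_client.get_paginator("get_tables")
    for page in paginator.paginate(DatabaseName=database_name):
        names.extend(table["Name"] for table in page["TableList"])
    return names
```

With a real client this could be called as `find_suffixed_tables(list_table_names(boto3.client("glue"), "my_database"), "table_name")`, where `my_database` is a placeholder for your own database name. If the unsuffixed `table_name` also shows up in the listing, that existing table is the likely source of the name clash.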
Upvotes: 3