Reputation: 63
I'm trying to rename files using the script below, but it doesn't catch the curly apostrophe in "Don’t", which should end up as "Don't". Any ideas on how I can do this?
import glob
import os
import unicodedata

def remove_accents(s):
    # Decompose characters, then drop the combining marks (accents)
    nkfd_form = unicodedata.normalize('NFKD', s)
    return u''.join([c for c in nkfd_form if not unicodedata.combining(c)])

for fname in glob.glob("**/*.mp3", recursive=True):
    new_fname = remove_accents(fname)
    if new_fname != fname:
        try:
            print('renaming non-ascii filename to', new_fname)
            os.rename(fname, new_fname)
        except Exception as e:
            print(e)
Upvotes: 1
Views: 489
Reputation: 362786
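Wrong tool for the job: unicodedata.normalize is not about removing accents at all. NFKD only decomposes characters, and ’ (U+2019, right single quotation mark) has no decomposition, so it passes through unchanged.
For down-converting to ASCII, look instead at unidecode: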
>>> from unidecode import unidecode
>>> unidecode("Don’t")
"Don't"
Upvotes: 3
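To plug this into the rename loop from the question, something like the sketch below might work. The helper name `ascii_path` is hypothetical, and the `except ImportError` branch is an assumed fallback (not part of unidecode) that only handles curly single quotes and accents, so the snippet still runs when the package is not installed. It transliterates only the final path component, on the assumption that directory names should stay as they are so that `os.rename` does not point into a directory that does not exist.

```python
import os
import unicodedata

try:
    from unidecode import unidecode  # third-party package: pip install Unidecode
except ImportError:
    # Assumed fallback, far less complete than unidecode: map curly single
    # quotes by hand, then drop combining marks after NFKD decomposition.
    _QUOTES = str.maketrans({"\u2018": "'", "\u2019": "'"})

    def unidecode(s):
        decomposed = unicodedata.normalize("NFKD", s.translate(_QUOTES))
        return "".join(c for c in decomposed if not unicodedata.combining(c))


def ascii_path(path):
    """Return path with only its last component down-converted to ASCII."""
    head, tail = os.path.split(path)
    return os.path.join(head, unidecode(tail))
```

In the question's loop, `new_fname = remove_accents(fname)` would then become `new_fname = ascii_path(fname)`; the rest of the loop can stay unchanged.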