user1010775

Reputation: 371

Replace special characters in Python

I need to replace special characters in filenames. I'm currently trying this with translate, but it isn't working well, and I hope you have an idea how to do this. The goal is a clean playlist: the MP3 player in my car is poor and can't handle umlauts or other special characters.

My code so far

# -*- coding: utf-8 -*-
import os
import sys
import id3reader
pfad = os.path.dirname(sys.argv[1])+"/"
ordner = ""

table = {
      0xe9: u'e',
      0xe4: u'ae',
      ord(u'ö'): u'oe',
      ord(u'ü'): u'ue',
      ord(u'ß'): u'ss',
      0xe1: u'ss',
      0xfc: u'ue',
    }
def replace(s):
    return ''.join(c for c in s if (c.isalpha() or c == " " or c == "-"))

fobj_in = open(sys.argv[1])
fobj_out = open(sys.argv[1]+".new","w")

for line in fobj_in:
    if (line.rstrip()[0:1]=="#" or line.rstrip()[0:1] ==" "):
        print line.rstrip()[0:1]
    else:
        datei= pfad+line.rstrip()
        #print datei
        id3info = id3reader.Reader(datei)
        dateiname= str(id3info.getValue('performer'))+" - "+ str(id3info.getValue('title'))
        #print dateiname
        arrPfad = line.split('/')

        dateiname = replace(dateiname[0:60])
        print dateiname
        #dateiname = dateiname.translate(table)+".mp3"
        ordner = arrPfad[0]+"/"+dateiname
        #os.rename(datei,pfad+ordner)
        fobj_out.write(ordner+"\r\n")
fobj_in.close()

I get this error: UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 37: ordinal not in range(128). If I try to use translate on the ID3 title, I get TypeError: expected a character buffer object.

Upvotes: 0

Views: 555

Answers (1)

ch3ka

Reputation: 12158

If I need to get rid of non-ASCII characters, I often use:

>>> import unicodedata
>>> unicodedata.normalize("NFKD", u"spëcïälchärs").encode('ascii', 'ignore')
'specialchars'

This converts each character to the ASCII part of its normalized Unicode decomposition. The downside is that it throws away everything it cannot map, and it is not smart enough to transliterate umlauts (to ue, ae, etc.).

But it might help you to at least play those mp3s.
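To make that limitation concrete, here is a small sketch of what the NFKD approach does to German text. It is written for Python 3 (where encode returns bytes, hence the extra decode), and the helper name ascii_fold is made up for this illustration:

```python
import unicodedata


def ascii_fold(text):
    """Decompose characters (NFKD), then drop everything that is not ASCII.

    ascii_fold is a hypothetical helper name, not part of any library.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    # 'ignore' silently discards the combining marks and any character
    # that has no ASCII equivalent after decomposition.
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(ascii_fold(u"spëcïälchärs"))  # specialchars
print(ascii_fold(u"Müller"))        # Muller  (not Mueller)
print(ascii_fold(u"Straße"))        # Strae   (ß has no decomposition, so it is dropped)
```

So the result is always plain ASCII, but "ü" becomes "u" rather than "ue", and "ß" disappears entirely.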

Of course, you are free to apply your own str.translate first and wrap the result in this to eliminate any non-ASCII characters still left. In fact, if your replace is correct, this will solve your problem. I'd suggest taking a look at str.translate and str.maketrans, though.
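A rough sketch of that combination, assuming Python 3 and a Unicode string as input: translate the umlauts with a dict keyed by code point first, then let the NFKD step strip whatever is left. The table contents and the name to_ascii_filename are assumptions for illustration, not taken from the question. (In Python 2 the dict form of translate only works on unicode objects, not on byte strings.)

```python
import unicodedata

# Hypothetical transliteration table: code point -> ASCII replacement.
UMLAUT_TABLE = {
    ord(u"ä"): u"ae",
    ord(u"ö"): u"oe",
    ord(u"ü"): u"ue",
    ord(u"Ä"): u"Ae",
    ord(u"Ö"): u"Oe",
    ord(u"Ü"): u"Ue",
    ord(u"ß"): u"ss",
}


def to_ascii_filename(text):
    """Transliterate German umlauts, then drop any remaining non-ASCII."""
    # Compose first, so "u" + combining diaeresis is seen as a single "ü"
    # and the table lookup can match it.
    composed = unicodedata.normalize("NFC", text)
    transliterated = composed.translate(UMLAUT_TABLE)
    # Fallback for everything not in the table (é -> e, etc.).
    decomposed = unicodedata.normalize("NFKD", transliterated)
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(to_ascii_filename(u"Für Elise"))  # Fuer Elise
print(to_ascii_filename(u"Straße"))     # Strasse
print(to_ascii_filename(u"Café"))       # Cafe
```

The order matters: if the NFKD step ran first, "ü" would already have been reduced to "u" and the table would never see it.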

Upvotes: 1
