Reputation: 5085
Any idea why I am getting a length of 6 instead of 5? I created a file called björn-100.png and ran this code with Python 3:
import os
for f in os.listdir("."):
    p = f.find("-")
    name = f[:p]
    print("name")
    print(name)
    length = len(name)
    print(length)
    for a in name:
        print(a)
prints out the following:
name
björn
6
b
j
o
̈
r
n
instead of printing out
name
björn
5
b
j
ö
r
n
Upvotes: 1
Views: 438
Reputation: 106543
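If you're using Python 2.7, first decode the file name from UTF-8 so that len counts code points rather than bytes (a name stored in decomposed form still needs the normalization shown further down):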
length = len(name.decode('utf-8'))
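But since you're using Python 3, where file names are already str and can't be decoded as if they were byte strings, I recommend using unicodedata to normalize the string. Your output shows that the file name is stored in decomposed form: the ö is the letter o followed by the combining diaeresis U+0308, which is why len reports 6. NFC normalization composes that pair into the single character ö.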
import unicodedata
length = len(unicodedata.normalize('NFC', name))
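To see the effect in isolation, here is a minimal sketch that builds the decomposed spelling by hand instead of reading it from disk. The variable names decomposed and composed are only illustrative, and the escape \u0308 stands in for whatever the file system actually returned.

```python
import unicodedata

# "björn" in decomposed form: 'o' followed by U+0308 COMBINING DIAERESIS.
decomposed = "bjo\u0308rn"

# NFC composes 'o' + U+0308 into the single code point U+00F6 (ö).
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed))           # 6
print(len(composed))             # 5
print(composed == "bj\u00f6rn")  # True
```

Both strings look the same when printed, so comparing lengths or code points is the reliable way to tell them apart.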
Upvotes: 3
Reputation: 5085
To get the string with ö as a single precomposed character, rather than an o followed by a separate combining diaeresis, normalize it to NFC:
import unicodedata
name = unicodedata.normalize('NFC', name)
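Applied to the original loop, this might look like the sketch below. The helper name_before_dash is hypothetical, and the sample file name is hard-coded in decomposed form in place of a real os.listdir result. It also uses str.partition instead of find, an intentional swap: find returns -1 when there is no dash, which would silently cut off the last character.

```python
import unicodedata

def name_before_dash(filename):
    """Return the NFC-normalized part of filename before the first '-'."""
    normalized = unicodedata.normalize("NFC", filename)
    # partition returns the whole string as the first element if '-' is absent.
    stem, _, _ = normalized.partition("-")
    return stem

# Hypothetical file name in decomposed form, as os.listdir might return it.
sample = "bjo\u0308rn-100.png"
name = name_before_dash(sample)

print(name)
print(len(name))  # 5
for ch in name:
    print(ch)
```

Normalizing before slicing keeps the base letter and its combining mark together, which matters if the split point could ever fall between them.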
Upvotes: 1