Crazy Otto

Reputation: 135

Python 2.7, Umlauts, UTF-8 and Lists

I'm trying to use Python 2.7 to replace German umlauts in a bunch of file names with other characters. I'm using the following code to get a list of all the files with umlauts in their names:

# -*- coding: utf-8 -*-

import os

def GetFilepaths_umlaut(directory):
    file_paths = [] 
    umlauts = ["Ä", "Ü", "Ö", "ä", "ö", "ü"]
    for root, directories, files in os.walk(directory):
        for filename in files:
            filepath = os.path.join(root, filename)
            if any(umlaut in filepath for umlaut in umlauts):
                file_paths.append(filepath)
    print file_paths
    return file_paths

GetFilepaths_umlaut(r'C:\Scripts\Replace Characters\Umlauts')

But when the list is printed to the console, it's not printing the umlauts (see screenshot). I've tried using encode() but am getting the error shown in the second screenshot. What am I doing wrong? Any feedback is greatly appreciated!

[Screenshot: console output of the printed list]

With encode() on filepath: [Screenshot: error message]

Upvotes: 3

Views: 1334

Answers (1)

mhawke

Reputation: 87134

print file_paths prints a list, not a string. How the output is displayed is up to the list object's str() method, which uses the repr() of each element. In this case it prints the elements of the list as escaped strings:

>>> s = u'a\xe4a'
>>> s
u'a\xe4a'
>>> print s
aäa
>>> [s]
[u'a\xe4a']
>>> print [s]
[u'a\xe4a']

To print the actual strings:

for s in file_paths:
    print s
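
If a single printable string is more convenient than a loop, joining the elements should give the same result, because the join never goes through the list's repr(). The following is only a minimal sketch: format_paths and sample_paths are hypothetical names that are not part of the original code, and it assumes the list holds unicode strings as in the interpreter session above.

```python
# -*- coding: utf-8 -*-


def format_paths(paths):
    """Return the paths as one unicode string, one path per line.

    Joining the elements bypasses the list's repr(), so non-ASCII
    characters are kept as-is instead of being shown as escapes.
    """
    return u"\n".join(paths)


# Hypothetical sample data standing in for the real file_paths list.
sample_paths = [u"C:\\Scripts\\a\xe4a.txt", u"C:\\Scripts\\\xfcber.txt"]
joined = format_paths(sample_paths)
```

Calling print format_paths(file_paths) would then write one path per line. Whether the umlauts display correctly still depends on the console's encoding.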

Upvotes: 1