Special characters in python

Question

I have a file with a lot of entries about Nobel prizes. I then convert that file into a list like this:

file = open(path, 'r')
file.readline()
content = []
for line in file:
    line = line.replace('\n', '')
    content.append(line.split(';'))

content = check(content, 'röntgen')

After that I have a function that takes that list and another argument and checks whether the list contains that argument. However, if the argument contains a special character such as ö, it doesn’t work, because when the file is read Python stores it as: Ã¶

def check(content, attr):
    reducedList = []
    for i in range(len(content)):
        curr = content[i][4]
        if curr.find(attr) != -1:
            reducedList.append(content[i])
    return reducedList

with:

curr = 'voor hun verdiensten op het gebied van de analyse van de kristalstructuur door middel van rÃ¶ntgenstraling'
attr = 'röntgen'

I have tried converting it with utf-8 but that doesn’t seem to help. Does anyone have a solution?

job vink · Accepted Answer

The solution is to replace open(path, 'r') with open(path, 'r', encoding='utf-8'). If you add the encoding parameter, Python will decode the file as UTF-8 instead of using the platform's default encoding, so when you compare the strings they really are the same.
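As a rough illustration, here is a minimal sketch of the difference the encoding parameter makes. It assumes the file was saved as UTF-8 and that the platform default was a Windows-style code page such as cp1252, which would explain the Ã¶. The sample row, its column names, and the helper load_rows are made up for this example; check is a compact version of the function from the question.

```python
import os
import tempfile


def load_rows(path, encoding="utf-8"):
    """Read a semicolon-separated file into a list of rows, skipping the header."""
    rows = []
    with open(path, "r", encoding=encoding) as f:
        f.readline()  # skip the header line, as in the question
        for line in f:
            rows.append(line.rstrip("\n").split(";"))
    return rows


def check(content, attr, column=4):
    """Keep only the rows whose given column contains attr."""
    return [row for row in content if attr in row[column]]


# Made-up sample data (hypothetical columns), written to disk as UTF-8.
sample = (
    "year;category;name;country;motivation\n"
    "1915;physics;Bragg;UK;analyse door middel van röntgenstraling\n"
)
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", suffix=".csv", delete=False
) as tmp:
    tmp.write(sample)
    demo_path = tmp.name

# Decoding the UTF-8 bytes as cp1252 (assumed default) reproduces the problem.
wrong = load_rows(demo_path, encoding="cp1252")
# Decoding them as UTF-8 gives back the original text.
right = load_rows(demo_path, encoding="utf-8")
os.remove(demo_path)

print(wrong[0][4])               # contains 'rÃ¶ntgen'
print(check(wrong, "röntgen"))   # no match
print(check(right, "röntgen"))   # the sample row matches
```

If the file turns out not to be UTF-8, the same call works with a different encoding name; the value passed to open just has to match the encoding the file was actually saved in.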

