Reputation: 431
I have a problem comparing a string read from a file with a string I entered in my program. They should be equal, but even when I use decode('utf-8') the comparison says they are not. Here's the code:
final = open("info", 'r')
exported = open("final", 'w')
lines = final.readlines()
for line in lines:
    if line == "Wykształcenie i praca": #error
        print "ok"
And this is how I write the file that I am trying to read:
comm_p = bs4.BeautifulSoup(comm)
comm_f.write(comm_p.prettify().encode('utf-8'))
for string in comm_p.strings:
    #print repr(string).encode('utf-8')
    save = string.encode('utf-8') # this is how I save
    info.write(save)
    info.write("\n")
info.close()
And at the top of the file I have # -*- coding: utf-8 -*-
Any ideas?
Upvotes: 1
Views: 333
Reputation: 9632
Use unicode strings for the comparison:
>>> s = u'Wykształcenie i praca'
>>> s == u'Wykształcenie i praca'
True
>>>
When it comes to strings, unicode is the smartest move :)
Upvotes: 0
Reputation: 174708
This should do what you need:
# -*- coding: utf-8 -*-
import io

with io.open('info', encoding='utf-8') as final:
    lines = final.readlines()

for line in lines:
    if line.strip() == u"Wykształcenie i praca": #error
        print "ok"
You need to open the file with the right encoding, and since your string is not ASCII, you should mark the literal as unicode with the u prefix.
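To see the whole round trip in one place, here is a minimal sketch that writes a few lines as UTF-8 and then searches for the target line. The helper names write_lines and contains_line, the sample lines, and the temporary path are made up for illustration; only io.open with encoding='utf-8' comes from the answer above.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

TARGET = u"Wykształcenie i praca"


def write_lines(path, lines):
    # Hypothetical helper: write unicode lines as UTF-8, one per line.
    with io.open(path, "w", encoding="utf-8") as handle:
        for line in lines:
            handle.write(line + u"\n")


def contains_line(path, wanted):
    # io.open decodes UTF-8 while reading, so every line is unicode text.
    # strip() removes the trailing newline before the comparison.
    with io.open(path, encoding="utf-8") as handle:
        return any(line.strip() == wanted for line in handle)


# Temporary file standing in for the "info" file from the question.
demo_path = os.path.join(tempfile.mkdtemp(), "info")
write_lines(demo_path, [u"Imię", TARGET, u"Miasto"])

found = contains_line(demo_path, TARGET)
missing = contains_line(demo_path, u"Wykształcenie")
```

Writing and reading through the same io.open call with the same encoding keeps the encode and decode steps symmetric, so there is no need to call encode('utf-8') or decode('utf-8') by hand.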
Upvotes: 3
Reputation: 8372
The difference is most likely a trailing '\n' character.
readlines() doesn't strip '\n'; see "Best method for reading newline delimited files in Python and discarding the newlines?"
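A quick way to check this without touching the disk is sketched below. io.StringIO stands in for the real file here, and the second line of sample text is invented for the demo.

```python
# -*- coding: utf-8 -*-
import io

# In-memory stand-in for the file (an assumption made for this demo).
fake_file = io.StringIO(u"Wykształcenie i praca\nInna linia\n")
lines = fake_file.readlines()

# Each element still ends with '\n', so a direct comparison fails.
raw_match = lines[0] == u"Wykształcenie i praca"

# Removing the newline first makes the comparison succeed.
stripped_match = lines[0].rstrip(u"\n") == u"Wykształcenie i praca"

# One way to discard the newlines for every line at once.
without_newlines = [line.rstrip(u"\n") for line in lines]
```

Printing repr(line) inside the original loop would show the trailing '\n' directly.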
In general it is not a good idea to put a Unicode string in your code; it would be better to read it from a resource file.
Upvotes: 0
Reputation: 336468
First, you need some basic knowledge about encodings. This is a good place to start. You don't have to read everything right now, but try to get as far as you can.
About your current problem:
You're reading a UTF-8 encoded file (probably), but you're reading it as an ASCII file. open() doesn't do any conversion for you.
So what you need to do (at least):
Use codecs.open("info", "r", encoding="utf-8") to read the file.
Compare against a unicode literal after stripping the newline: if line.rstrip() == u"Wykształcenie i praca":
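The difference between raw bytes and decoded text can be sketched as follows. The temporary path and the variable names are assumptions for the demo; binary mode is used to imitate what Python 2's plain open() hands back.

```python
# -*- coding: utf-8 -*-
import codecs
import os
import tempfile

TARGET = u"Wykształcenie i praca"

# Temporary file standing in for "info"; written as raw UTF-8 bytes.
path = os.path.join(tempfile.mkdtemp(), "info")
with open(path, "wb") as handle:
    handle.write(TARGET.encode("utf-8") + b"\n")

# Binary mode returns the undecoded UTF-8 bytes.
with open(path, "rb") as handle:
    raw_line = handle.readline()

# codecs.open decodes while reading, so the line is unicode text.
with codecs.open(path, "r", encoding="utf-8") as handle:
    text_line = handle.readline()

# The bytes only match the literal once they are decoded.
decoded_match = raw_line.decode("utf-8").rstrip() == TARGET
text_match = text_line.rstrip() == TARGET

# "ł" takes two bytes in UTF-8, so the byte length differs from the
# character count.
length_differs = len(raw_line.rstrip()) != len(TARGET)
```

Either decoding by hand or letting codecs.open do it works; the second is harder to forget.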
Upvotes: 0