Reputation: 7310
I have a list of article titles that I store in a text file and load into a list. I'm trying to compare the current title with all the titles in that list, like so:
def duplicate(entry):
    for line in posted_titles:
        print 'Comparing'
        print entry.title
        print line
        if line.lower() == entry.title.lower():
            print 'found duplicate'
            return True
    return False
My problem is that this never returns True. Even when it prints out identical strings for entry.title and line, it won't flag them as equal. Is there a string comparison method or something else I should be using?
Edit
After looking at the representation of the strings with repr(line), the strings that are being compared look like this:
u"Some Article Title About Things And Stuff - Publisher Name"
'Some Article Title About Things And Stuff - Publisher Name'
Upvotes: 0
Views: 264
Reputation: 388313
It would have helped even more if you had provided an actual example.
In any case, your problem is the two different string types in Python 2. entry.title is apparently a unicode string (denoted by the u before the quotes), while line is a normal str (or vice versa).
For strings that contain only ASCII characters, the equality comparison will succeed, because Python 2 implicitly decodes the str with its default ASCII codec before comparing. For other characters it won’t:
>>> 'Ä' == u'Ä'
False
When doing the comparison in the reversed order, IDLE actually gives a warning here:
>>> u'Ä' == 'Ä'
Warning (from warnings module):
File "__main__", line 1
UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
False
You can get a unicode string from a normal string by using str.decode and supplying the original encoding. For example, latin1 in my IDLE:
>>> 'Ä'.decode('latin1')
u'\xc4'
>>> 'Ä'.decode('latin1') == u'Ä'
True
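The encoding you pass matters because the same character maps to different bytes in different encodings. The following is only a small sketch of that point; the variable names are made up for illustration, and it is written so that it runs on both Python 2 and Python 3:

```python
# -*- coding: utf-8 -*-
# Sketch: one character, two different byte representations.
text = u'\xc4'                         # the character Ä as a unicode string

utf8_bytes = text.encode('utf-8')      # two bytes: 0xC3 0x84
latin1_bytes = text.encode('latin1')   # one byte: 0xC4

# Decoding with the encoding the bytes were written in restores the text.
utf8_roundtrip = utf8_bytes.decode('utf-8') == text
latin1_roundtrip = latin1_bytes.decode('latin1') == text

# Decoding UTF-8 bytes as latin1 does not raise, but it yields two
# unrelated characters instead of the original one.
mismatched = utf8_bytes.decode('latin1') == text

print(utf8_roundtrip)
print(latin1_roundtrip)
print(mismatched)
```

Both round trips compare equal, while the mismatched decode does not, so it is worth finding out which encoding the text file was actually saved in.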
If you know it’s utf-8, you could also specify that. For example, the following file, saved as utf-8, will also print True:
# -*- coding: utf-8 -*-
print('Ä'.decode('utf-8') == u'Ä')
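Applied to the function from the question, the idea could look roughly like the sketch below. It assumes the titles file is UTF-8 and that lines may carry a trailing newline from the file; neither is confirmed by the question. The helper names to_text and is_duplicate are hypothetical, and the code is written to behave the same on Python 2 and Python 3:

```python
def to_text(value, encoding='utf-8'):
    """Return value as a unicode string, decoding byte strings first.

    On Python 2, `bytes` is an alias for `str`, so plain strings read
    from a file are decoded here; unicode objects pass through as-is.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def is_duplicate(title, posted_titles, encoding='utf-8'):
    """Case-insensitive check whether title is already in posted_titles."""
    # Assumption: surrounding whitespace (e.g. a trailing newline) should
    # not count as a difference, so both sides are stripped.
    wanted = to_text(title, encoding).strip().lower()
    for line in posted_titles:
        if to_text(line, encoding).strip().lower() == wanted:
            return True
    return False


# Byte strings as they might come from the file, unicode title from the feed.
posted = [b'Other Article\n', b'Some Title \xc3\x84\n']
print(is_duplicate(u'some title \xc4', posted))
print(is_duplicate(u'Not Posted Yet', posted))
```

Converting both sides to unicode before comparing avoids relying on Python 2's implicit ASCII conversion.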
Upvotes: 1
Reputation: 26333
== is fine for string comparison. Make sure you are dealing with strings:
if str(line).lower() == str(entry.title).lower():
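Another possible syntax is the boolean expression str1 is str2, although is tests object identity rather than equality, so it is not a reliable way to compare the contents of two strings.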
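As a small illustration of how == and is differ for strings: the sketch below builds two strings with the same contents as separate objects. The variable names are made up for the example, and the result of is depends on the interpreter, so treat it as a demonstration rather than guaranteed behaviour:

```python
first = u'some article title'
# Build an equal string at runtime so that it is a separate object.
second = u' '.join([u'some', u'article', u'title'])

same_value = first == second    # compares the contents
same_object = first is second   # compares object identity

print(same_value)
print(same_object)
```

The contents compare equal with ==, while is typically reports False in CPython here because the two names refer to different objects.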
Upvotes: 0