Reputation: 32653

How to compare a Unicode string to an lxml element and a simple string?

This is my code:

    for name in doc_preparate.cssselect('.dbl1:first-child'):
        if name.text != u"Продукция":
            print name.text

I don't know why it doesn't work. Here is the result:

    Артрозан
    Продукция
    Пенталгин
    Продукция
    Пенталгин
    Продукция
    Пенталгин
    Продукция
    Пенталгин
    Продукция
    ...

P.S.

I tried this:

    for name in doc_preparate.cssselect('.dbl1:first-child'):
        print type(name.text)
        if u"Продукция" not in name.text:
            print name.text

But it is not working either :(

How can I fix this problem?

Upvotes: 0

Answers (2)

Spencer Rathbun

Reputation: 14900

Probably because you are comparing the strings with an equality operator. This has hidden pitfalls, because strings are sequences of characters. It is more obvious in C, where comparing two strings with the equals sign gives bad results because you are comparing the pointer to the first string with the pointer to the second.

Python is clever enough to compare the contents rather than the pointers, but the comparison still returns False unless the strings are exactly identical. If your data is padded with whitespace to a fixed width, the strings will differ.

    padded = 'Python   '
    plain = 'Python'

These do not compare equal. To see if your string is contained in the input, use

    plain in padded

But note that this also returns True for

    'Python' in 'Python    '
    'Python' in 'PythonAnd other stuff   '

Check the Python docs on strings for more information and alternative methods.
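As a rough illustration of the padding issue, the sketch below trims surrounding whitespace before doing an exact comparison, which avoids the false positives of a substring test. The helper name is_same_label and the padded sample value are assumptions made for illustration; the question does not confirm that the scraped text is actually padded.

```python
# -*- coding: utf-8 -*-

def is_same_label(text, target):
    """Return True when text equals target after trimming surrounding whitespace.

    Hypothetical helper, not part of lxml.
    """
    if text is None:  # an element without text has .text set to None
        return False
    return text.strip() == target


# Assumed sample value: the label surrounded by whitespace.
padded = u"  Продукция\n"

print(padded == u"Продукция")                # False: the padding makes the strings differ
print(is_same_label(padded, u"Продукция"))   # True: compared after strip()
```

In the loop from the question, the condition would then read "if not is_same_label(name.text, u"Продукция")".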

Upvotes: 2

0x6adb015

Reputation: 7801

Check the type of name.text.

    Python 2.6.5 (r265:79063, Apr 16 2010, 13:57:41)
    [GCC 4.4.3] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> a = "allo"
    >>> b = u"allo"
    >>> type(a)
    <type 'str'>
    >>> type(b)
    <type 'unicode'>
    >>>

Make sure that name.text is of type unicode as well, so that both sides of the comparison have the same type. In Python 3, all strings are Unicode.
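If the value turns out to be a byte string, one possible approach is to decode it before comparing. The sketch below assumes the bytes are UTF-8 encoded; the helper name to_text and its encoding parameter are hypothetical, and the real encoding depends on how the page was fetched and parsed.

```python
# -*- coding: utf-8 -*-

def to_text(value, encoding="utf-8"):
    """Decode byte strings to text; return text values unchanged.

    Hypothetical helper; the UTF-8 default is an assumption.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Assumed sample value: the label as undecoded UTF-8 bytes.
raw = u"Продукция".encode("utf-8")

print(type(raw))                       # a byte string type, not a text type
print(to_text(raw) == u"Продукция")    # True once both sides are text
```

Decoding with the wrong encoding either raises UnicodeDecodeError or produces garbled text, so printing type(name.text) and repr(name.text) first is a reasonable way to see what is actually being compared.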

Upvotes: 0
