Reputation: 1911
While performing a substring match, I get UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 8: ordinal not in range(128)
Code:
for bhk in bed_bath:
    if "Bedroom" in bhk.text or "Chambre à coucher" in bhk.text or "Slaapkamer" in bhk.text:
        bhk_count += 1
How do I resolve it?
I have included the lines below at the beginning of my file.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
Upvotes: 1
Views: 73
Reputation: 4118
I'm assuming you are using Python 2.
The problem occurs because bhk.text is a unicode string.
When you do a comparison like "Chambre à coucher" in bhk.text, the literal, which is a non-unicode byte string, needs to be converted to a unicode string first.
Since you declared your file to have a utf-8 encoding, the unicode character à is stored in the literal as the two-byte string "\xc3\xa0".
When Python tries to convert the byte 0xc3 using the default codec (ascii), it cannot map it to a unicode character and raises that error.
The solution would be to declare the strings with non-ascii characters as unicode, like:
u"Chambre à coucher" in bhk.text
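As a rough illustration of the mechanism described above, the sketch below decodes the UTF-8 bytes of the literal with the ascii codec by hand, which is what Python 2 does implicitly during the comparison, and then repeats the match with a unicode literal. The value assigned to text is a hypothetical stand-in for bhk.text, which is not shown in the question.

```python
# -*- coding: utf-8 -*-
# Sketch only: 'text' is an assumed stand-in for bhk.text.

# The bytes that a plain "Chambre à coucher" literal holds in a
# UTF-8 encoded Python 2 source file.
raw = b"Chambre \xc3\xa0 coucher"

# Python 2 implicitly decodes the byte string with the ascii codec
# when it is compared against a unicode string; done by hand here.
try:
    raw.decode("ascii")
    error = None
except UnicodeDecodeError as exc:
    error = exc  # byte 0xc3 is outside the ASCII range

# The fix: make the literal unicode so no implicit decode is needed.
needle = u"Chambre à coucher"
text = u"2 Chambre à coucher"  # hypothetical content of bhk.text
found = needle in text
```

Here error.start gives the offset of the offending byte, which corresponds to the position reported in the traceback from the question.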
Upvotes: 3