UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019'

Question

I keep getting the below error and can't seem to get .encode('ascii', errors='ignore') to work.

eqs = soup.find_all('div', {'style': 'margin:7px 5px 0px;vertical-align:top;text-align:center;display:inline-block;line-height:normal;width:120px;'})

for equipment in eqs:
    if '#b0c3d9' in str(equipment):
        f2.write(equipment.getText() + ', Common\n')
    if '#5e98d9' in str(equipment):
        f2.write(equipment.getText() + ', Uncommon\n')
    if '#4b69ff' in str(equipment):
        f2.write(equipment.getText() + ', Rare\n')
    if '#8847ff' in str(equipment):
        f2.write(equipment.getText() + ', Mythical\n')
    if '#b28a33' in str(equipment):
        f2.write(equipment.getText() + ', Immortal\n')
    if '#d32ce6' in str(equipment):
        f2.write(equipment.getText() + ', Legendary\n')
    if '#eb4b4b' in str(equipment):
        f2.write(equipment.getText() + ', Ancient\n')
    if '#ade55c' in str(equipment):
        f2.write(equipment.getText() + ', Arcana\n')

I have tried:

f2.write(equipment.getText().encode('ascii', errors='ignore'))

and

f2.write(equipment.encode('ascii', errors='ignore').getText())

I have also tried some other things I am ashamed to post, such as running the encode over the file that BeautifulSoup later reads from, but that just throws a different error. Thanks again for helping.

full traceback:

Traceback (most recent call last):
 File "", line 1, in 
  import D2soup1
 File "D2soup1.py", line 86, in 
  test()
 File "D2soup1.py", line 30, in test
  f2.write(equipment.getText() + ', Immortal\n')
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 5: ordinal not in range(128)

I am using str() to pick the box-shadow color out of the HTML. I know it is probably not the best practice, but it was the only way I could think of to grab it. I am still new to BeautifulSoup.

Martijn Pieters · Accepted Answer

You are using str(equipment) without a codec; you are encoding the Tag object to ASCII.

Don't use str; get the text once as a unicode value. And use a mapping and a loop instead of so many if statements.
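As a minimal sketch of why the explicit encoding matters: the item name and file path below are made up for illustration, and `io.open` is used only because it behaves the same way on Python 2 and Python 3. Encoding a string that contains u'\u2019' to ASCII fails, while a file opened with an explicit UTF-8 encoding accepts the unicode value directly.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# Hypothetical item name containing U+2019 (right single quotation mark).
text = u'Tiny\u2019s Cleaver'

# Encoding to ASCII fails; this is the conversion Python 2 attempts
# implicitly when a unicode value is written to a plain byte file.
try:
    text.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# A file opened with an explicit encoding takes unicode text as-is
# and encodes it to UTF-8 on write.
path = os.path.join(tempfile.mkdtemp(), 'equipment.txt')  # hypothetical path
with io.open(path, 'w', encoding='utf-8') as f2:
    f2.write(text + u', Immortal\n')

# Read the line back to confirm the character survived the round trip.
with io.open(path, encoding='utf-8') as f:
    written = f.read()
```

With a file opened this way, the `.encode('utf8')` call on the text is no longer needed; pick one approach or the other, not both.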

In this case, the style attribute is all you need to test against:

types = {
    '#b0c3d9': 'Common',
    '#5e98d9': 'Uncommon',
    '#4b69ff': 'Rare',
    '#8847ff': 'Mythical',
    '#b28a33': 'Immortal',
    '#d32ce6': 'Legendary',
    '#eb4b4b': 'Ancient',
    '#ade55c': 'Arcana'
}

for equipment in eqs:
    style = equipment.div.attrs.get('style', '')
    # Encode the text once, explicitly, instead of relying on the implicit ASCII codec.
    textcontent = equipment.getText().encode('utf8')
    for key in types:
        if key in style:
            f2.write('{}, {}\n'.format(textcontent, types[key]))

Most likely, however, those color codes are in an attribute on the equipment tag; look just in the tag value, or use a .find() call to narrow down your searches.
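If the color does sit on the tag itself, the lookup only needs that tag's style string (in BeautifulSoup, something like `equipment.get('style', '')`). The helper below is a sketch under that assumption: `rarity_for_style` is a hypothetical name, and the sample style strings are invented, since the question's HTML is not shown.

```python
TYPES = {
    '#b0c3d9': 'Common',
    '#5e98d9': 'Uncommon',
    '#4b69ff': 'Rare',
    '#8847ff': 'Mythical',
    '#b28a33': 'Immortal',
    '#d32ce6': 'Legendary',
    '#eb4b4b': 'Ancient',
    '#ade55c': 'Arcana',
}


def rarity_for_style(style, types=TYPES):
    """Return the rarity whose color code appears in style, or None."""
    # Compare in lowercase so '#B28A33' and '#b28a33' both match.
    style = style.lower()
    for color, rarity in types.items():
        if color in style:
            return rarity
    return None


# Hypothetical style values, standing in for a tag's style attribute.
immortal = rarity_for_style('box-shadow: 0px 0px 2px 4px #B28A33;')
unknown = rarity_for_style('margin: 7px 5px 0px;')
```

Returning None for an unmatched style makes it easy to skip tags that carry no rarity color instead of writing a partial line.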
