guilemon

Reputation: 140

TypeError while doing replace() on encoded BeautifulSoup result in Python

I am trying to encode the text I get after parsing HTML with the BeautifulSoup library in Python 3, but I get the following error:

----> gmtext.encode('ascii', errors='replace').replace("?", "")

TypeError: a bytes-like object is required, not 'str'

Here is the code:

import urllib.request as urllib2
from bs4 import BeautifulSoup

articleURL = "http://digimon.wikia.com/wiki/Guilmon"

page = urllib2.urlopen(articleURL).read().decode('utf8', 'ignore')
soup = BeautifulSoup(page, 'lxml')
gmtext = soup.find('p').text

gmtext.encode('ascii', errors='replace').replace("?", "")

So far, all answers I found regarding this error have been about some sort of file open error.

Upvotes: 0

Views: 262

Answers (2)

Totoro

Reputation: 887

encode() returns bytes, so you can call replace() with bytes arguments (put b before each string literal), like:

gmtext.encode('ascii', errors='replace').replace(b"?", b"")
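
To see what that line does end to end, here is a minimal sketch. The sample string is made up and only stands in for the paragraph scraped by soup.find('p').text; the variable names other than gmtext are illustrative too.

```python
# Hypothetical sample text standing in for soup.find('p').text
gmtext = "Guilmon\u2019s caf\u00e9"

# encode() returns bytes; each non-ASCII character becomes b'?'
encoded = gmtext.encode('ascii', errors='replace')

# bytes.replace() needs bytes arguments, hence the b prefixes
cleaned = encoded.replace(b"?", b"")

# decode back to str if you need text rather than bytes
result = cleaned.decode('ascii')
print(result)
```

Keep in mind that this also strips any real question marks that were in the original text, since they are indistinguishable from the substituted ones after encoding.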

Upvotes: 1

Omar Einea

Reputation: 2524

.replace() exists on both str and bytes, but you're calling it after .encode(),
which returns a bytes object, and bytes.replace() only accepts bytes-like arguments, not str.

If you want to, you can do replacement before encoding like so:

gmtext.replace("?", "").encode('ascii', errors='replace')

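That avoids the TypeError. Note that it only removes question marks already present in the text; the ones that errors='replace' substitutes for non-ASCII characters are added afterwards and stay in the result.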
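
A short sketch of how the order of the two calls affects the result, again with a made-up sample string in place of the scraped text. The errors='ignore' variant is not part of the answer above; it is shown as one possible alternative if the goal is simply to drop non-ASCII characters.

```python
# Hypothetical sample text standing in for soup.find('p').text
gmtext = "Is Guilmon\u2019s caf\u00e9 open?"

# Replacing first removes only the question marks already in the text;
# encode() then inserts a fresh b'?' for every non-ASCII character.
replace_first = gmtext.replace("?", "").encode('ascii', errors='replace')

# Alternative (an assumption about the intent): errors='ignore' drops
# non-ASCII characters and leaves real question marks untouched.
ignored = gmtext.encode('ascii', errors='ignore')

print(replace_first)
print(ignored)
```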

Upvotes: 1
