user5621062

Python requests: UnicodeEncodeError: 'charmap' codec can't encode character

I scraped a webpage (name changed in code here) as follows:

import requests
r = requests.get('https://www.samplewebpage.com')

Then I tried to write r.text to a file as follows:

f = open('filename', 'w')
f.write(r.text)
f.close()

I get the following error:

UnicodeEncodeError: 'charmap' codec can't encode character '\u20b9' in position 158691: character maps to <undefined>

r.encoding shows UTF-8. How can I resolve this error?

I have also tried the following:
- A few other random webpages; the code runs without any error for most of them.
- Using r.content.decode('utf-8', 'ignore') instead of r.text, which gives the same error as above.

My environment/system specifications:

I suspected a console encoding mismatch, as described in a similar question on this forum, so I confirmed that the Atom console is set to UTF-8. However, I do not believe console encoding is the problem here, since I am writing to a file.

Thanks

Upvotes: 2

Views: 2249

Answers (1)

pynexj

Reputation: 20698

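Try explicitly specifying the file's encoding. Without an encoding argument, open() falls back to the platform's locale encoding, which on Windows is often a legacy code page such as cp1252 (it shows up as 'charmap' in the traceback) that cannot represent '\u20b9':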

with open('filename', 'w', encoding='utf-8') as f:
    f.write(r.text)
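
To see why the explicit encoding matters, here is a minimal sketch that reproduces the failure without any network access. It assumes cp1252 as a stand-in for the default encoding on the asker's machine (the question does not state it), and the helper write_text, the sample string, and the file name page.html are hypothetical, chosen only for illustration.

```python
import locale
import os
import tempfile

# Sample text containing the rupee sign (U+20B9) named in the traceback.
text = "Price: \u20b9 100"


def write_text(path, data, encoding):
    """Write data to path in text mode using the given encoding."""
    with open(path, "w", encoding=encoding) as f:
        f.write(data)


tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "page.html")  # hypothetical file name

# Assumption: cp1252 stands in for a typical Windows default encoding.
# It has no mapping for U+20B9, so the write raises UnicodeEncodeError.
try:
    write_text(path, text, "cp1252")
    cp1252_failed = False
except UnicodeEncodeError:
    cp1252_failed = True

# With UTF-8 specified explicitly, the same text is written and read back intact.
write_text(path, text, "utf-8")
with open(path, encoding="utf-8") as f:
    round_trip = f.read()

# Shows what open() would typically use when no encoding is passed.
print("locale encoding:", locale.getpreferredencoding(False))
print("cp1252 write failed:", cp1252_failed)
print("UTF-8 round trip ok:", round_trip == text)
```

If the decoded text is not needed at all, another option is to open the file in binary mode ('wb') and write r.content, which saves the raw bytes from the response without any encoding step.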

Upvotes: 3
