Y.Su

Reputation: 406

Python requests encoding issues

I use Python requests to make a GET request to this URL. Here is the code snippet.

import requests

url = 'http://213.139.159.46/prj-wwvauskunft/projects/gus/daten/index.jsp?'
params = {'id': 2619521210}

response = requests.get(
    url,
    params=params
)

print(response.status_code)

text = response.text
content = response.content

I run the same code in Python 2.7 and Python 3.6.

When I compare the text variable between the two versions, the values are different, but content is the same in both. I am confused as to why content is the same but text differs. Shouldn't text be the same as well if both versions are using the same encoding to encode text to content?

I used chardet to detect the encoding of content; both versions reported ISO-8859-1. What could be the reason for not using UTF-8? Is it just a preference?

Also, when I do:

content.replace('span', '')

In Python 2, it works. In Python 3, it throws this error: TypeError: a bytes-like object is required, not 'str'. (Using b'span' and b'' solves the problem.)

But when I do:

text.replace('span', '')

Both versions work. Why is that?

Upvotes: 0

Views: 127

Answers (1)

JosefZ

Reputation: 30153

There is no guarantee of compatibility between Python 2 and Python 3 (neither backward nor forward). Read e.g. Python 2 vs Python 3: Key Differences. For instance, suppose you modify your script by adding the following code snippet to the end:

print('type(text)   ', type(text))
print('type(content)', type(content))

Output:

py -2 D:\Python\SO3\61954902.py
200
('type(text)   ', <type 'unicode'>)
('type(content)', <type 'str'>)
py -3 D:\Python\SO3\61954902.py
200
type(text)    <class 'str'>
type(content) <class 'bytes'>
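
The type output above also accounts for the replace() behaviour in the question. Here is a minimal Python 3 sketch; the sample body is a hypothetical stand-in for response.content and is not fetched from the URL in the question:

```python
# Hypothetical sample body standing in for response.content (Python 3).
content = '<span>Gr\u00fc\u00dfe</span>'.encode('iso-8859-1')  # bytes
text = content.decode('iso-8859-1')                            # str

# str.replace takes str arguments; bytes.replace takes bytes arguments.
cleaned_text = text.replace('span', '')
cleaned_content = content.replace(b'span', b'')

# Mixing the two types is rejected in Python 3.
try:
    content.replace('span', '')
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(type(text), type(content), mixed_ok)
```

In Python 2, content is a str and the literal 'span' is also a str, so content.replace('span', '') works there. The text variable is a text string in both versions (unicode in Python 2, str in Python 3), which is why text.replace('span', '') works in both.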

For the sake of completeness, the script is as follows:

type D:\Python\SO3\61954902.py
import requests
url = 'http://213.139.159.46/prj-wwvauskunft/projects/gus/daten/index.jsp?'
params = {'id': 2619521210}

response = requests.get(
    url,
    params=params
)

print(response.status_code)

text = response.text
content = response.content
print('type(text)   ', type(text))
print('type(content)', type(content))
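
On the "same content, different text" part: requests builds text by decoding content with response.encoding, so identical bytes can yield different strings if the codec differs. The sketch below only illustrates that general effect with hypothetical sample bytes; it does not establish what happened for the URL in the question.

```python
# Hypothetical sample bytes; not the body returned by the server in the question.
raw = 'Stra\u00dfe'.encode('utf-8')

# The same bytes decoded with two different codecs give two different strings.
as_utf8 = raw.decode('utf-8')
as_latin1 = raw.decode('iso-8859-1')

# ISO-8859-1 maps every byte to exactly one character, so it never fails
# and always round-trips back to the original bytes.
same_bytes = as_latin1.encode('iso-8859-1') == raw

print(len(as_utf8), len(as_latin1), same_bytes)
```

Printing response.encoding in both interpreters would be one way to check whether the two runs actually decode with the same codec.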

Upvotes: 0
