Othmane Bazine

Reputation: 1

Getting incomplete content with requests.get()

Each time, I have a URL like this in my code: https://www.pinterest.com/resource/BaseSearchResource/get/?source_url=%2Fsearch%2Fpins%2F%3Fq%3Dyellow%2520car%2520on%2520T-shirt%26rs%3Dtyped%26term_meta%5B%5D%3Dyellow%257Ctyped%26term_meta%5B%5D%3Dcar%257Ctyped%26term_meta%5B%5D%3Don%257Ctyped%26term_meta%5B%5D%3DT-shirt%257Ctyped&data=%7B%22options%22%3A%20%7B%22isPrefetch%22%3A%20false%2C%20%22auto_correction_disabled%22%3A%20false%2C%20%22query%22%3A%20%22yellow%20car%20on%20T-shirt%22%2C%20%22redux_normalize_feed%22%3A%20true%2C%20%22rs%22%3A%20%22typed%22%2C%20%22scope%22%3A%20%22pins%22%2C%20%22page_size%22%3A%2050%2C%20%22bookmarks%22%3A%20%5Bnull%5D%7D%2C%20%22context%22%3A%20null%7D&_=1729500369393

This URL returns my query results when searching on Pinterest, and the result at the link is a dictionary (JSON). But when I try to get the content with requests.get(), the content is incomplete and a lot of images are missing.

import json
import requests

url = 'https://www.pinterest.com/resource/BaseSearchResource/get/?source_url=%2Fsearch%2Fpins%2F%3Fq%3Dyellow%2520car%2520on%2520T-shirt%26rs%3Dtyped%26term_meta%5B%5D%3Dyellow%257Ctyped%26term_meta%5B%5D%3Dcar%257Ctyped%26term_meta%5B%5D%3Don%257Ctyped%26term_meta%5B%5D%3DT-shirt%257Ctyped&data=%7B%22options%22%3A%20%7B%22isPrefetch%22%3A%20false%2C%20%22auto_correction_disabled%22%3A%20false%2C%20%22query%22%3A%20%22yellow%20car%20on%20T-shirt%22%2C%20%22redux_normalize_feed%22%3A%20true%2C%20%22rs%22%3A%20%22typed%22%2C%20%22scope%22%3A%20%22pins%22%2C%20%22page_size%22%3A%2050%2C%20%22bookmarks%22%3A%20%5Bnull%5D%7D%2C%20%22context%22%3A%20null%7D&_=1729500369393'

response = requests.get(url)

# Check if the request was successful
if response.status_code == 200:
    # Parse the JSON data
    data = json.loads(response.text)

When I compare the contents of the data variable with what the browser shows, I find many differences and a lot of missing data.
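One way to pin down exactly what is missing is to save the JSON shown in the browser and compare its key paths with the parsed response. The following is only a rough sketch: `key_paths` and `missing_paths` are hypothetical helper names, and the sample payloads are made up for illustration, not the real Pinterest structure.

```python
def key_paths(value, prefix=""):
    """Return the set of dotted key paths found in nested dicts and lists."""
    paths = set()
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            paths.add(path)
            paths |= key_paths(child, path)
    elif isinstance(value, list):
        # List items contribute an index segment such as "results[1]".
        for index, child in enumerate(value):
            paths |= key_paths(child, f"{prefix}[{index}]")
    return paths


def missing_paths(browser_data, script_data):
    """List key paths present in the browser payload but absent from the script's."""
    return sorted(key_paths(browser_data) - key_paths(script_data))


# Made-up payloads standing in for the browser JSON and the requests.get() JSON.
browser_sample = {"results": [{"id": "1"}, {"id": "2"}]}
script_sample = {"results": [{"id": "1"}]}
print(missing_paths(browser_sample, script_sample))
```

With the real data, `browser_data` would be the JSON copied from the browser and loaded with `json.load`, and `script_data` would be the `data` variable from the code above.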

I tried setting timeout=10, thinking the problem was caused by the large amount of content at the link, but the problem remained. I also tried other libraries such as urllib.request and http.client and faced the same problem, although I may have used them incorrectly.

Upvotes: 0

Views: 32

Answers (1)

ih-isj

Reputation: 31

I think it's because the web service detects that you're fetching the data from a script rather than from a browser.

Try adding headers to your request that mimic a browser, including a User-Agent and any relevant cookies. This makes the request look to the web service as if it came from a browser.

import json
import requests

url = 'https://www.pinterest.com/resource/BaseSearchResource/get/?source_url=%2Fsearch%2Fpins%2F%3Fq%3Dyellow%2520car%2520on%2520T-shirt%26rs%3Dtyped%26term_meta%5B%5D%3Dyellow%257Ctyped%26term_meta%5B%5D%3Dcar%257Ctyped%26term_meta%5B%5D%3Don%257Ctyped%26term_meta%5B%5D%3DT-shirt%257Ctyped&data=%7B%22options%22%3A%20%7B%22isPrefetch%22%3A%20false%2C%20%22auto_correction_disabled%22%3A%20false%2C%20%22query%22%3A%20%22yellow%20car%20on%20T-shirt%22%2C%20%22redux_normalize_feed%22%3A%20true%2C%20%22rs%22%3A%20%22typed%22%2C%20%22scope%22%3A%20%22pins%22%2C%20%22page_size%22%3A%2050%2C%20%22bookmarks%22%3A%20%5Bnull%5D%7D%2C%20%22context%22%3A%20null%7D&_=1729500369393'

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36',
    'Accept-Language': 'en-US,en;q=0.9',
    'Accept': 'application/json, text/javascript, */*; q=0.01',
    'Referer': 'https://www.pinterest.com/',
}

response = requests.get(url, headers=headers)

# Check if the request was successful
if response.status_code == 200:
    # Parse the JSON data
    data = json.loads(response.text)
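The headers above do not cover the cookies mentioned earlier. As a hedged sketch, the request could go through a `requests.Session`, which stores whatever cookies the site sets on a first page load and sends them with the search request. The helper names `build_search_params` and `fetch_search` are hypothetical, the parameter layout is simply decoded from the URL in the question, and whether Pinterest then returns the complete result set is an untested assumption (it may also require a logged-in session).

```python
import json
import time
from urllib.parse import quote

BASE_URL = "https://www.pinterest.com/resource/BaseSearchResource/get/"

BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36",
    "Accept-Language": "en-US,en;q=0.9",
    "Accept": "application/json, text/javascript, */*; q=0.01",
    "Referer": "https://www.pinterest.com/",
}


def build_search_params(query, page_size=50):
    """Rebuild the query-string fields seen in the question's URL."""
    # One term_meta[] entry per word, as in the original source_url.
    term_meta = "".join(
        f"&term_meta[]={quote(word + '|typed')}" for word in query.split()
    )
    source_url = f"/search/pins/?q={quote(query)}&rs=typed{term_meta}"
    options = {
        "isPrefetch": False,
        "auto_correction_disabled": False,
        "query": query,
        "redux_normalize_feed": True,
        "rs": "typed",
        "scope": "pins",
        "page_size": page_size,
        "bookmarks": [None],
    }
    return {
        "source_url": source_url,
        "data": json.dumps({"options": options, "context": None}),
        "_": str(int(time.time() * 1000)),  # millisecond timestamp, as in the URL
    }


def fetch_search(session, query, timeout=10):
    """Fetch one page of results; `session` is a requests.Session from the caller."""
    session.headers.update(BROWSER_HEADERS)
    # Load the home page first so any cookies it sets are kept on the session.
    session.get("https://www.pinterest.com/", timeout=timeout)
    response = session.get(BASE_URL, params=build_search_params(query), timeout=timeout)
    response.raise_for_status()
    return response.json()


# Usage (performs real network requests, so it is left commented out):
# import requests
# with requests.Session() as session:
#     data = fetch_search(session, "yellow car on T-shirt")
```

Passing the fields through `params=` lets requests do the URL encoding, which is easier to adjust than the long pre-encoded string.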

Upvotes: 0
