jim jarnac

Reputation: 5152

Python POST request

I'm using Python requests to search the following site: https://www.investing.com/ for the terms "Durable Goods Orders US"

I checked the "Network" tab of the browser's inspector, and the search seems to be sent simply as the following form data: 'quotes_search_text':'Durable Goods Orders US'

So I tried with python:

URL = 'https://www.investing.com/'
data = {'quotes_search_text':'Durable Goods Orders US'}
resp = requests.post(URL, data=data, headers={ 'User-Agent': 'Mozilla/5.0', 'X-Requested-With': 'XMLHttpRequest'})

However, this doesn't return the results that I can see when doing the search manually. All the search results should have "gs-title" as a class attribute (as per the page inspection), but when I do:

soup = BeautifulSoup(resp.text, 'html.parser')
soup.select(".gs-title")

I see no results... Is there some aspect of POST requests that I am not taking into account? (I'm a complete noob here.)

Upvotes: 0

Views: 373

Answers (1)

double_j

Reputation: 1706

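After going over this in detail in the chat, there are many changes. The search results are not in the HTML that the site returns; they are loaded afterwards by JavaScript from Google's Custom Search endpoint. To retrieve the information you're looking for, you need to send the same request that the page's JS sends. You can change the query variable to whatever you want.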

import requests
import json
from urllib.parse import quote_plus

URL = 'https://www.googleapis.com/customsearch/v1element'

query = 'Durable Goods Orders US'
query_formatted = quote_plus(query)

data = {
    'key':'AIzaSyCVAXiUzRYsML1Pv6RwSG1gunmMikTzQqY',
    'num':10,
    'hl':'en',
    'prettyPrint':'true',
    'source':'gcsc',
    'gss':'.com',
    'cx':'015447872197439536574:fy9sb1kxnp8',
    'q':query,  # raw text: requests URL-encodes params itself, so pre-encoding would double-encode
    'googlehost':'www.google.com'
}
headers = {
    'User-Agent':'Mozilla/5.0',
    'Referer':'https://www.investing.com/search?q=' + query_formatted,
}
resp = requests.get(URL, params=data, headers=headers)

j = json.loads(resp.text)
# print(resp.text)
for r in j['results']:
    print(r['title'], r['url'])
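
One detail worth checking is how the query gets encoded. requests URL-encodes whatever is passed via params, so a value that has already been through quote_plus ends up encoded twice. Below is a minimal offline sketch of that effect using only the standard library. The helper names build_search_url and extract_results are hypothetical (they are not part of requests or of the API), and the 'results' / 'title' / 'url' response shape is assumed from the loop above.

```python
from urllib.parse import urlencode, quote_plus, parse_qs, urlsplit

BASE_URL = 'https://www.googleapis.com/customsearch/v1element'

def build_search_url(query, num=10):
    """Hypothetical helper: build the GET URL, encoding the raw query once."""
    params = {'q': query, 'num': num, 'hl': 'en'}
    return BASE_URL + '?' + urlencode(params)

def extract_results(payload):
    """Hypothetical helper: return (title, url) pairs from a decoded JSON dict.

    Returns an empty list when the 'results' key is missing (for example,
    if the service answered with an error object instead).
    """
    return [(r.get('title'), r.get('url')) for r in payload.get('results', [])]

query = 'Durable Goods Orders US'

# Encoded once: spaces become '+'.
once = build_search_url(query)
# Encoded twice: the '+' signs from quote_plus are themselves escaped as %2B.
twice = build_search_url(quote_plus(query))

print(once)
print(twice)

# Decoding shows what the server would see as the search text in each case.
print(parse_qs(urlsplit(once).query)['q'])
print(parse_qs(urlsplit(twice).query)['q'])
```

In the double-encoded case the server sees literal plus signs in the search text rather than spaces, which may change the matches. Passing the raw string in params and keeping quote_plus only for the hand-built Referer URL avoids that.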

Upvotes: 1
