Reputation: 532
I am having an issue with submitting an HTTP Post request. My purpose of this program is to scrape the lyrics off a website, and then use that string in a text summarizer. I am having an issue submitting the POST request on the summarizer's website. Currently with the code below, it does not submit request. It just returns the page. I think it may be due to the content-type being different, but I am not sure.
My code:
def summarize(lyrics):
url = 'http://www.freesummarizer.com'
values = {'text' : lyrics,
'maxsentences' : '1',
'maxtopwords' : '40',
'email' : '[email protected]' }
headers = {'User-Agent' : 'Mozilla/5.0'}
cookies = {'_jsuid': '777245265', '_ga':'GA1.2.164138903.1423973625', '__smToken':'elPdHJINsP5LvAYhia6OAA68', '__smListBuilderShown':'true', '_first_pageview':'1', '_gat':'1', '_eventqueue':'%7B%22heatmap%22%3A%5B%7B%22type%22%3A%22heatmap%22%2C%22href%22%3A%22%252F%22%2C%22x%22%3A324%2C%22y%22%3A1800%2C%22w%22%3A640%7D%5D%2C%22events%22%3A%5B%5D%7D', 'PHPSESSID':'28b0843d49700e134530fbe32ea62923', '__smSmartbarShown':'true'}
r = requests.post(url, data=values, headers=headers)
print(r.text)
My Response:
'transfer-encoding': 'chunked'
'set-cookie': 'PHPSESSID=1f10ec11e6f9040cbb5a81e16bfcdf7f; path=/',
'expires': 'Thu, 19 Nov 1981 08:52:00 GMT'
'keep-alive': 'timeout=5, max=100'
'server': 'Apache'
'connection': 'Keep-Alive'
'pragma': 'no-cache'
'cache-control': 'no-store, no-cache, must-revalidate, post-check=0, pre-check=0'
'date': 'Fri, 27 Feb 2015 18:38:41 GMT'
'content-type': 'text/html'
A successful response on this website:
Host: freesummarizer.com
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:35.0) Gecko/20100101 Firefox/35.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Referer: http://freesummarizer.com/
Cookie: _jsuid=777245265; _ga=GA1.2.164138903.1423973625; __smToken=elPdHJINsP5LvAYhia6OAA68; __smListBuilderShown=true; _first_pageview=1; _gat=1; _eventqueue=%7B%22heatmap%22%3A%5B%7B%22type%22%3A%22heatmap%22%2C%22href%22%3A%22%252F%22%2C%22x%22%3A324%2C%22y%22%3A1800%2C%22w%22%3A640%7D%5D%2C%22events%22%3A%5B%5D%7D; PHPSESSID=28b0843d49700e134530fbe32ea62923; __smSmartbarShown=true
Connection: keep-alive
Content-Type: application/x-www-form-urlencoded
Content-Length: 6044
Upvotes: 0
Views: 101
Reputation: 710
Everything seems to be working just fine with requests
.
But I think the issue here is that you are using the wrong tool for the job.
The tool I believe you are looking for is Selenium.
Selenium automates browsers. That's it! What you do with that power is entirely up to you. Primarily, it is for automating web applications for testing purposes, but is certainly not limited to just that. Boring web-based administration tasks can (and should!) also be automated as well.
You should absolutely take a look it this tool.
Upvotes: 1