Reputation: 457
I'm trying to see if I can pull data using the code below. For some reason, the beautifulsoup printout doesn't contain the data I see. I'm wondering where I've gone wrong. I've been trying different kind of headers, which is where I think my problem is but I may be wrong. For example I'm unable to find the following path when I inspect the page on the browser: <div class="textbold font-medium ng-binding">$25,000</div>
import urllib2
from bs4 import BeautifulSoup
url='https://www.prosper.com/listings#/detail/4964721'
hdr = {'Accept': 'text/html,application/xhtml+xml,*/*',"user-agent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36"}
req=urllib2.Request(url,headers=hdr)
html = urllib2.urlopen(req)
soup=BeautifulSoup(html,"lxml")
print soup
Upvotes: 0
Views: 52
Reputation: 4029
url reponse has to be read like this
html = urllib2.urlopen(req).read()
Based on your example, it appears you are looking for rendered html.
In your case, an ajax request is made to
Response to this ajax request is a json which gets rendered on to the UI.
Upvotes: 3