ppxx

Reputation: 198

Scraping URL from a Javascript loaded webpage

I am trying to scrape the href of every ad listed on IMMOWEB for the search URL in the code below. The URLs are loaded by JavaScript. I am using HTMLSession from requests_html, but I couldn't get the results I need. Here is my code:


from bs4 import BeautifulSoup
from requests_html import HTMLSession

url = 'https://www.immoweb.be/en/search/apartment/for-sale?countries=BE&isNewlyBuilt=false&maxBedroomCount=3&maxPrice=200000&maxSurface=130&minBedroomCount=1&minPrice=100000&minSurface=65&postalCodes=2000,2018,2060,2140,2170,2600,2610,2627,2640,2650,2660,2845,2850,2900,2980&page=1&orderBy=newest&card=9267356'

sessions = HTMLSession()
r = sessions.get(url)
r.html.render()
soup = BeautifulSoup(r.content, "html.parser")
print(soup)

Outputs required:

https://www.immoweb.be/en/classified/apartment/for-sale/antwerpen-merksem/2170/9268787?searchId=606f2c6d4c669  
https://www.immoweb.be/en/classified/apartment/for-sale/merksem/2170/9268390?searchId=606f2c6d4c669
...and the other hrefs

Upvotes: 1

Views: 81

Answers (1)

Andrej Kesely

Reputation: 195438

The URLs are constructed dynamically via JavaScript, but the raw HTML embeds the search results as escaped JSON in a `:results` attribute. You can read the ID of every property from it and build each URL manually (following the short URL redirects to the full one):

import re
import json
import requests
from html import unescape


url = "https://www.immoweb.be/en/search/apartment/for-sale?countries=BE&isNewlyBuilt=false&maxBedroomCount=3&maxPrice=200000&maxSurface=130&minBedroomCount=1&minPrice=100000&minSurface=65&postalCodes=2000,2018,2060,2140,2170,2600,2610,2627,2640,2650,2660,2845,2850,2900,2980&page=1&orderBy=newest&card=9267356"

# the search results sit in the HTML as escaped JSON inside :results='...'
html_doc = requests.get(url).text
data = json.loads(unescape(re.search(r":results='(.*?)'", html_doc).group(1)))

# uncomment to print all data:
# print(json.dumps(data, indent=4))

# build the short URL from each property's ID; the site redirects it to the full URL
for p in data:
    print("https://www.immoweb.be/en/classified/{}".format(p["id"]))

Prints:

https://www.immoweb.be/en/classified/9268787
https://www.immoweb.be/en/classified/9268390
https://www.immoweb.be/en/classified/9268389
https://www.immoweb.be/en/classified/9268360
https://www.immoweb.be/en/classified/9267356
https://www.immoweb.be/en/classified/9266168
https://www.immoweb.be/en/classified/9264424
https://www.immoweb.be/en/classified/9264140
https://www.immoweb.be/en/classified/9264032
https://www.immoweb.be/en/classified/9263981
https://www.immoweb.be/en/classified/9263142
https://www.immoweb.be/en/classified/9261903
https://www.immoweb.be/en/classified/9261838
https://www.immoweb.be/en/classified/9261546
https://www.immoweb.be/en/classified/9261343
https://www.immoweb.be/en/classified/9261328
https://www.immoweb.be/en/classified/9261133
https://www.immoweb.be/en/classified/9260764
https://www.immoweb.be/en/classified/9260370
https://www.immoweb.be/en/classified/9214008
https://www.immoweb.be/en/classified/9259711
https://www.immoweb.be/en/classified/9258900
https://www.immoweb.be/en/classified/9258810
https://www.immoweb.be/en/classified/9258199
https://www.immoweb.be/en/classified/9258195
https://www.immoweb.be/en/classified/9258183
https://www.immoweb.be/en/classified/9258179
https://www.immoweb.be/en/classified/9215058
https://www.immoweb.be/en/classified/9256793
https://www.immoweb.be/en/classified/9256422
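
To try the extraction step without hitting the site, here is a minimal offline sketch of the same idea. It assumes the page still carries the data in a `:results='...'` attribute, as the regex above expects. The `SAMPLE_HTML` string, the `iw-search` tag name, and the `extract_listing_urls` helper are hypothetical and only there for illustration. Unlike the one-liner above, it returns an empty list when the attribute is missing instead of raising an `AttributeError`.

```python
import json
import re
from html import unescape

# Hypothetical stand-in for the downloaded page. Only the `:results`
# attribute name comes from the answer; the tag name is made up.
SAMPLE_HTML = (
    "<iw-search :results='[{&quot;id&quot;:9268787},"
    "{&quot;id&quot;:9268390}]'></iw-search>"
)


def extract_listing_urls(html_doc):
    """Return one short classified URL per listing found in html_doc."""
    match = re.search(r":results='(.*?)'", html_doc)
    if match is None:
        # Attribute not found (layout changed or request blocked).
        return []
    # The attribute value is HTML-escaped JSON: unescape, then parse.
    listings = json.loads(unescape(match.group(1)))
    return [
        "https://www.immoweb.be/en/classified/{}".format(item["id"])
        for item in listings
    ]


urls = extract_listing_urls(SAMPLE_HTML)
for u in urls:
    print(u)
```

With a live page you would pass `requests.get(url).text` to the helper instead of the sample string.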

Upvotes: 2
