Chris lee

Reputation: 3

Python web scraping (using BeautifulSoup) question

I am trying to scrape https://www.edgeprop.sg/condo-apartment/aquarius-by-the-park to get the Land Size (sqm) value from the overview table. The expected result is 40,608.

However, I am unable to get the result I want. Here is my code:

#[Python] test webscrape on edgeprop
# Only the imports this snippet actually uses
import requests
from bs4 import BeautifulSoup


query_string = 'https://www.edgeprop.sg/condo-apartment/aquarius-by-the-park'
resp = requests.get(query_string)
soup = BeautifulSoup(resp.content, 'html.parser')
print("URL is: ", query_string)

try:
    landsize = soup.find_all("h4",class_="detail-title__text")
    print("Landsize is: ", landsize)

except IndexError:
    pass

Upvotes: 0

Views: 155

Answers (1)

dimay

Reputation: 2804

Try this:

import json
import requests
from bs4 import BeautifulSoup

query_string = 'https://www.edgeprop.sg/condo-apartment/aquarius-by-the-park'

resp = requests.get(query_string)

soup = BeautifulSoup(resp.content, 'html.parser')

# get data with all info
data = soup.find("script", id="__NEXT_DATA__").text

# convert string to python dict
json_data = json.loads(data)

# get land_size from dict
print(json_data["props"]["pageProps"]["projectInfo"]["data"]["land_size"])

The HTML response contains a JSON payload (inside the script tag with id __NEXT_DATA__) that includes all of the page's information.
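The chained lookup in the code above raises an exception if the script tag is missing or the site changes its JSON layout. Below is a minimal sketch of a more defensive version, assuming the key path from the answer is still what the page uses. The helper names dig and land_size_from_next_data are hypothetical, and the sample payload is made up for illustration rather than taken from the site.

```python
import json

# Key path copied from the answer above; the site may change it at any time.
LAND_SIZE_PATH = ("props", "pageProps", "projectInfo", "data", "land_size")


def dig(payload, path):
    """Walk nested dicts along path; return None if any key is missing."""
    current = payload
    for key in path:
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def land_size_from_next_data(script_text):
    """Parse the __NEXT_DATA__ script text and return land_size, or None."""
    if not script_text:
        return None
    try:
        payload = json.loads(script_text)
    except json.JSONDecodeError:
        return None
    return dig(payload, LAND_SIZE_PATH)


# Made-up sample payload for illustration; not real data from the site.
sample = json.dumps(
    {"props": {"pageProps": {"projectInfo": {"data": {"land_size": 12345}}}}}
)
print(land_size_from_next_data(sample))
print(land_size_from_next_data("{}"))
```

With the soup object from the answer, the input could be obtained as tag = soup.find("script", id="__NEXT_DATA__") and then passed as tag.string if tag else None, so a missing tag yields None instead of an AttributeError.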

Upvotes: 1
