Reputation: 47
I am using BeautifulSoup to scrape data from the website https://oxygen.digiavidity.com/?fbclid=IwAR3d_HtQPWni0lyHOMQOdokZGg3J7acwYc80EOFX7g8XYHloC550R5BtO94 .
I want to select a particular district from the District drop-down box, keeping the other two drop-down boxes at their defaults, and get all the supplier names (shown in bold) and contacts for that district. However, I am not able to fetch the required data.
Suppose I set the drop-down boxes as follows (screenshot not included):
Here is my code:
import requests
from bs4 import BeautifulSoup

url = "https://oxygen.digiavidity.com/?fbclid=IwAR3d_HtQPWni0lyHOMQOdokZGg3J7acwYc80EOFX7g8XYHloC550R5BtO94"
soup = BeautifulSoup(requests.get(url).content, "lxml")
x = soup.find_all('div', class_='list-group')
for val in x:
    name = val.find('h5', class_='mb-1').text
    contact = val.find('p').text
    print(name)
    print(contact)
Someone, please help me. Thanks in advance!
Upvotes: 0
Views: 104
Reputation: 9619
There is no need to scrape the HTML of this website, since the data is loaded from an API. You can request the data with requests and parse the JSON response into Python objects with response.json(). You can then load it into pandas, for example.
import requests
import pandas as pd
headers = {'User-Agent': 'Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.2; Trident/6.0; Touch)',
'Content-type': 'application/json; charset=UTF-8',
}
response = requests.post('https://oxygen.digiavidity.com/ViewData/All', headers=headers)
df = pd.DataFrame(response.json())
Result of df.head():
 | _id | Ident | District | Area_Name | Supplier_Name | Supplier_Contact | Updated_date | Updated_Time | Fresh_Cylinder_Availability | Oxygen_Refilling | Additional_Information | Delivery_Range | SPOC | Availability_Status |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 60b91659c21655ec6eac3bf6 | 1 | Kolkata | Kolkata | Swarnabha Dey | 9038399847 | 3-June-2021 | 8:31 PM | Yes | No | photo identity proof and prescription required | All over West Bengal | Ranita | nan |
1 | 60b91659c21655ec6eac3bf7 | 2 | Bankura | Bankura | Shreyasi(Volunteer) | 7866855988 | 3-Jun-2021 | 12:57 PM | Yes | Yes | photo identity proof and prescription required | Bankura | Ranita | immediate refilling will be done only in town. Rest will take some time or contacts will be shared |
2 | 60b91659c21655ec6eac3bf8 | 3 | Bankura | Maliyaja, Bankura | Baishali Tiwari | 9831935524 | 20-May-2021 | 8:14:00 AM | No | No | nan | Bankura | Chirantan | Delivering cylinders only to hospitals |
3 | 60b91659c21655ec6eac3bf9 | 4 | Birbhum | Rampurhat | Deb Bikram Dutta, Tarun Dutta (Don't call before 10am) | 9434132232 | 3-Jun-2021 | 1:00:00 PM | Yes | Yes | Prescription and Aadhar card required | Rampurhat | Ranita | Both fresh cylinder and refilling available |
4 | 60b9165ac21655ec6eac3bfa | 5 | Birbhum | Bolpur | Ani | 7029177504 | 3-Jun-2021 | 13:03:00 | Yes | Yes | Whatsapp him the patient details to his number | Bolpur | Ranita | Both fresh cylinder and refilling available |
You can filter by district like this: df[df['District'] == 'Birbhum']
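To get only the supplier names and contacts that the question asks for, something along these lines should work. This is a minimal sketch, not a definitive implementation: the helper name suppliers_in_district is hypothetical, the case-insensitive match is my own addition, and the sample frame is built from the rows shown in the table above so the example runs without calling the API. With the real data you would pass the df created from response.json() instead.

```python
import pandas as pd

# Sample rows copied from the df.head() output above (three columns only),
# so the sketch runs offline. Replace `sample` with the real `df`.
sample = pd.DataFrame(
    {
        "District": ["Kolkata", "Bankura", "Bankura", "Birbhum", "Birbhum"],
        "Supplier_Name": [
            "Swarnabha Dey",
            "Shreyasi(Volunteer)",
            "Baishali Tiwari",
            "Deb Bikram Dutta, Tarun Dutta (Don't call before 10am)",
            "Ani",
        ],
        "Supplier_Contact": [
            "9038399847",
            "7866855988",
            "9831935524",
            "9434132232",
            "7029177504",
        ],
    }
)


def suppliers_in_district(df, district):
    """Return supplier names and contacts for one district.

    Hypothetical helper: the comparison ignores case and surrounding
    whitespace, which is an assumption about how the data may vary.
    """
    mask = df["District"].str.strip().str.casefold() == district.strip().casefold()
    return df.loc[mask, ["Supplier_Name", "Supplier_Contact"]].reset_index(drop=True)


birbhum = suppliers_in_district(sample, "birbhum")

# Print each supplier name followed by its contact, as in the question.
for name, contact in birbhum.itertuples(index=False):
    print(name)
    print(contact)
```

Selecting the two columns with df.loc keeps the result as a DataFrame, so it can still be exported or filtered further; a district with no matches simply gives an empty frame.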
Upvotes: 1