Reputation: 750
I am trying to download the data from this page: https://www.ireps.gov.in/epsn/anonymSearch.do?advancedSearch=&searchParam=&searchOption=1&searchOptorOption=0&railwayZone=-1&dateFrom=01/08/2020&dateTo=12/09/2020&linkVal=department&selectDate=TENDER_OPENING_DATE&count=20406&pageNo=1
I have the following code to do that:
import requests
import urllib.request
downloadUrl = "https://www.ireps.gov.in/epsn/anonymSearch.do?advancedSearch=&searchParam=&searchOption=1&searchOptorOption=0&railwayZone=-1&dateFrom=01/08/2020&dateTo=12/09/2020&linkVal=department&selectDate=TENDER_OPENING_DATE&count=20406&pageNo=1"
req = requests.get(downloadUrl, timeout=2.50)
page = urllib.request.urlopen(downloadUrl).read().decode('utf-8')
with open('page1.htm', 'wb') as f:
    f.write(req.content)
with open("page1_1.htm",'w') as f:
    f.write(page)
As you can see, I have used two modules to check that I am not doing something wrong with one of them, but both return incomplete data. That is, what you see when you open this link in a web browser is different from what both pages downloaded by the script show.
Note that I am not interested in the CSS, only in the HTML data of that website.
Clearly the website is not sending back the complete data.
How do I get the complete data from that website?
Upvotes: 1
Views: 93
Reputation: 195603
The data you see is loaded with a POST request:
import requests
from bs4 import BeautifulSoup
# Form fields sent by the search form
data = {
    'searchOption': '1',
    'searchOptorOption': '0',
    'advancedSearch': '',
    'organization': '01',
    'workArea': '-1',
    'changezone': '',
    'railwayZone': '-1',
    'division': '-1',
    'unit': '-1',
    'tenderStage': '-1',
    'tenderType': '-1',
    'bidding': '-1',
    'selectDate': 'TENDER_OPENING_DATE',
    'dateFrom': '12/09/2020',
    'dateTo': '12/09/2020',
    'submit': 'Show+Results',
    'searchParam': '',
}
url = 'https://www.ireps.gov.in/epsn/anonymSearch.do'
# POST the form fields instead of requesting the URL with GET, then parse the returned HTML
soup = BeautifulSoup(requests.post(url, data=data).content, 'lxml')
# Select the rows of the results table and print each row's cells separated by '|'
for row in soup.select('table[bordercolor="#4D817A"] tr'):
    print(row.get_text(strip=True, separator='|'))
Prints:
Deptt./Rly. Unit|Tender No|Tender Title|Status|Work Area|Due Date/Time|Due Days|Actions
GSD/RAIPUR/SOUTH EAST CENTRAL RLY|20201076|Modified Elastomeric Pad|Tender Box Open|Goods & Service|12/09/2020 10:30|LAPSED
RWSS/RAIPUR/SOUTH EAST CENTRAL RLY|22205156|Unequal angle, Size : 130 x 80 x 6 mm,|Published|Goods & Service|12/09/2020 10:30|LAPSED
STORES/WESTERN RLY|15201832|RUBBER PROFILE FOR WINDOW GUIDE AS PER|Tender Box Open|Goods & Service|12/09/2020 11:00|LAPSED
DAHOD/WESTERN RLY|61205807|SET OF METALIC FLEXIBLE|Tender Box Open|Goods & Service|12/09/2020 11:00|LAPSED
DAHOD/WESTERN RLY|61205713A|Mn. Steel liner for|Tender Box Open|Goods & Service|12/09/2020 11:00|LAPSED
CMS/KYN-MEDICAL/CENTRAL RLY|KYN202112510EPS-4|DENGUE IgG / IgM/ NSi TEST STRIPS OR CARD, WHO APPROVED ONLY|Published|Goods & Service|12/09/2020 11:00|LAPSED
DAHOD/WESTERN RLY|61205830|Jumbo" Medical oxygen cy|Tender Box Open|Goods & Service|12/09/2020 11:00|LAPSED
DAHOD/WESTERN RLY|61201208A|ACETAL HOMOPOLYMER GUIDE BUSH FOR AXLE|Tender Box Open|Goods & Service|12/09/2020 11:00|LAPSED
ELS/VALSAD/WESTERN RLY|56205170|Silicon based grease|Tender Box Open|Goods & Service|12/09/2020 11:00|LAPSED
DAHOD/WESTERN RLY|61201425|BLOW OUT COILWITH CONTACT SUPPORT SILVER|Tender Box Open|Goods & Service|12/09/2020 11:00|LAPSED
DAHOD/WESTERN RLY|61201422A|MOTOR SPIRIT|Tender Box Open|Goods & Service|12/09/2020 11:00|LAPSED
GSD/SABARMATI/WESTERN RLY|71201108|Liquefied Petroleum Gas|Tender Box Open|Goods & Service|12/09/2020 11:00|LAPSED
MEDICAL/SOUTH CENTRAL RLY|SCRLGD2033502530369|REAGENT FOR THE QUANTITATIVE ESTIMATION OF FERRITIN IN PATIE......|Tender Box Open|Goods & Service|12/09/2020 11:00|LAPSED
LLH/EASTERN RLY|52201386|PLATE|Published|Goods & Service|12/09/2020 11:00|LAPSED
LOWER PAREL/WESTERN RLY|52205558|S.S Dustbin for inside lavatory as per Drg. No. C/SK (MISC)-......|Published|Goods & Service|12/09/2020 11:00|LAPSED
LLH/EASTERN RLY|52201294B|WEARING PLATE|Published|Goods & Service|12/09/2020 11:00|LAPSED
DAHOD/WESTERN RLY|61205740A|TRANSPARENT OIL TRAP CH|Tender Box Open|Goods & Service|12/09/2020 11:00|LAPSED
LOWER PAREL/WESTERN RLY|52205507|HIGH STRENGTH SEGMENTED PATTERN UNIQUE STRUCTURAL SETUP AS P......|Published|Goods & Service|12/09/2020 11:00|LAPSED
LLH/EASTERN RLY|52201379|PIPE 20 MM|Published|Goods & Service|12/09/2020 11:00|LAPSED
LLH/EASTERN RLY|52205672|Powder (Electrostatic Painting)|Published|Goods & Service|12/09/2020 11:00|LAPSED
DAHOD/WESTERN RLY|61205511B|Welding protective glass Din-13|Tender Box Open|Goods & Service|12/09/2020 11:00|LAPSED
LLH/EASTERN RLY|52201300A|EMERY CLOTH|Published|Goods & Service|12/09/2020 11:00|LAPSED
ADI DIVISION-MEDICAL/WESTERN RLY|ADILP23019R|Mycophenolate Mofetil 500mg Tab|Tender Box Open|Goods & Service|12/09/2020 11:00|LAPSED
STORES/CRJ/CLW|92205412A|Injection Infliximab (100mg),|Tender Box Open|Goods & Service|12/09/2020 11:00|LAPSED
ANVT-CHG-STORES/NORTHERN RLY|45205108|Procurement of Switch Plate Assembly for LHB EOG GS Coaches.|Published|Goods & Service|12/09/2020 11:00|LAPSED
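If you want records rather than printed text, the pipe-separated rows above can be turned into dictionaries keyed by the header row. This is only a minimal sketch using the standard library: `rows_to_records` and `records_to_csv` are hypothetical helper names, not part of the code above, and splitting on `|` assumes that no cell text itself contains a `|`. The data rows above have one cell fewer than the header (the "Actions" cell is empty), so missing cells are padded with an empty string.

```python
import csv
import io
from itertools import zip_longest


def rows_to_records(lines):
    """Turn '|'-separated text rows (header row first) into a list of dicts."""
    rows = [line.split("|") for line in lines if line.strip()]
    if not rows:
        return []
    header, body = rows[0], rows[1:]
    # Pad rows that are shorter than the header with "" and ignore any
    # extra trailing cells, so every record has exactly the header's keys.
    return [
        dict(zip_longest(header, cells[:len(header)], fillvalue=""))
        for cells in body
    ]


def records_to_csv(records):
    """Serialise the records as CSV text, using the first record's keys as columns."""
    if not records:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0]))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Sample rows copied from the output above.
sample = [
    "Deptt./Rly. Unit|Tender No|Tender Title|Status|Work Area|Due Date/Time|Due Days|Actions",
    "GSD/RAIPUR/SOUTH EAST CENTRAL RLY|20201076|Modified Elastomeric Pad|Tender Box Open|Goods & Service|12/09/2020 10:30|LAPSED",
    "LLH/EASTERN RLY|52201386|PLATE|Published|Goods & Service|12/09/2020 11:00|LAPSED",
]

records = rows_to_records(sample)
print(records[0]["Tender No"], records[0]["Due Days"])
print(records_to_csv(records))
```

In the scraping loop you would collect the `row.get_text(...)` strings into a list instead of printing them and pass that list to `rows_to_records`. Building each record directly from the row's cells would avoid the `|` assumption altogether.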
Upvotes: 2