tharun amireddy

Reputation: 21

Unable to scrape a website using BeautifulSoup

I tried to scrape a page using Beautiful Soup (bs4), but the request fails before I get any data. I even set request headers, as suggested in an answer to another Stack Overflow question. This is my code:

from bs4 import BeautifulSoup
import requests

headers = {
    'Referer': 'hello',
}
# The call and its arguments must be on one logical line
r = requests.get('https://www.doamin.com/bangalore/restaurants', headers=headers)
print(r.status_code)

This is the error that I am getting:

requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))

and this

 raise RemoteDisconnected("Remote end closed connection without"
 http.client.RemoteDisconnected: Remote end closed connection without response

I also tried setting a User-Agent header:

import requests
url = 'https://www.example.com/bangalore/restaurants'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36'
}
response = requests.get(url, headers=headers)
print(response.content)

But I still get the same error.

Can anyone help me out?

Upvotes: 2

Views: 1437

Answers (2)

Bertrand Martel

Reputation: 45473

My guess is that the server validates the User-Agent string more strictly: when the header claims to be Chrome, it checks the version against a list of real Chrome releases. The version you specified (41.0.2228) does not appear in the Chrome version history. Try, for instance, 41.0.2272:

import requests
url = 'https://www.example.com/bangalore/restaurants'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/41.0.2272.0 Safari/537.36'
}
response = requests.get(url, headers=headers)
print(response.content)
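
To check whether the version in the User-Agent really is what makes the difference, something like the following might help. It is only a sketch: `build_headers` and `first_successful` are hypothetical helper names, not part of requests, and the idea that the server filters on Chrome version is still just my guess. It tries each version in turn and reports which one, if any, gets a response instead of a dropped connection.

```python
import requests


def build_headers(chrome_version):
    """Return headers with a Chrome-style User-Agent for the given version."""
    user_agent = (
        "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 "
        f"(KHTML, like Gecko) Chrome/{chrome_version} Safari/537.36"
    )
    return {"User-Agent": user_agent}


def first_successful(url, chrome_versions, get=requests.get, timeout=10):
    """Try each Chrome version in order.

    Returns (version, response) for the first request that does not raise
    requests.exceptions.ConnectionError, or None if every attempt fails.
    The `get` parameter exists only so the function can be exercised
    without touching the network.
    """
    for version in chrome_versions:
        try:
            response = get(url, headers=build_headers(version), timeout=timeout)
        except requests.exceptions.ConnectionError as exc:
            # This is the exception type shown in the question's traceback.
            print(f"Chrome/{version}: connection dropped ({exc})")
            continue
        return version, response
    return None
```

You could call it as `first_successful(url, ['41.0.2228.0', '41.0.2272.0'])` and then inspect `response.status_code` on the result. If every version fails, the block is probably based on something other than the User-Agent.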

Upvotes: 1

Abdul Qoyyuum

Reputation: 73

Most likely Zomato (like many other data-collecting websites) has implemented measures to block scrapers and data miners. Just use their API instead: https://developers.zomato.com/api

Upvotes: 0
