Reputation: 458
I need to scrape the lowest-priced hotel deal from this metasearch engine, but I can't get it to work. All I get is an empty list when I look up elements by class, even though the request fetches the HTML I expect. What am I doing wrong? Here is my code:
# -*- coding: utf-8 -*-
"""
Created on Sat Jul 09 13:30:55 2016
@author: sroy
"""
import requests
from bs4 import BeautifulSoup
url = "https://www.kayak.co.in/hotels/Kolkata,India-c44834/2016-07-09/2016-07-10/2guests"
headers = {
    'Accept':"text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
    'Accept-Encoding':"gzip, deflate, sdch, br",
    'Accept-Language':"en-US,en;q=0.8",
    'Cache-Control':"max-age=0",
    'Connection':"keep-alive",
    'DNT':1,
    'Host':"www.kayak.co.in",
    'Referer':"https://www.kayak.co.in/hotels",
    'Upgrade-Insecure-Requests':1,
    'User-Agent':"Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36"
}
req = requests.get(url, headers=headers)
soup = BeautifulSoup(req.text.encode('utf-8'))
hotel_name = soup.find_all(".title")
price_elems = soup.find_all(".price")
for hotel in hotel_name:
    i=0
    print hotel_name[i]
    print price_elems[i]
    i+=1
It prints nothing, and I don't know why. What's the problem?
Upvotes: 1
Views: 660
Reputation: 473993
You are using CSS selectors but passing them to the find_all() method instead of select():
hotel_name = soup.select(".title")
price_elems = soup.select(".price")
That said, I still think you would need a real browser, since the site is quite dynamic. In any case, make sure to study the Terms of Use and stay on the legal side.
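To illustrate the difference between find_all() and select(), here is a minimal sketch run against a small static snippet rather than the live site. The markup, hotel names, and prices are made up for the example, and the real page's structure may well differ. It also assumes that titles and prices appear in the same order and in equal numbers, which is what makes pairing them with zip() reasonable.

```python
from bs4 import BeautifulSoup

# Hypothetical static markup, invented for this example.
# The real page's class names and structure may differ.
html = """
<div class="hotel"><span class="title">Hotel A</span><span class="price">1200</span></div>
<div class="hotel"><span class="title">Hotel B</span><span class="price">950</span></div>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all() treats a string argument as a tag name, so this looks for
# tags literally named ".title" and finds none.
by_tag_name = soup.find_all(".title")

# select() interprets the string as a CSS selector.
titles = soup.select(".title")
prices = soup.select(".price")

# The find_all() equivalent uses the class_ keyword instead of a selector.
titles_via_find_all = soup.find_all(class_="title")

# Pair each title with its price (assumes matching order and count),
# then pick the entry with the lowest price.
pairs = [(t.get_text(strip=True), int(p.get_text(strip=True)))
         for t, p in zip(titles, prices)]
cheapest = min(pairs, key=lambda pair: pair[1])

for name, price in pairs:
    print("%s: %d" % (name, price))
print("Cheapest: %s at %d" % cheapest)
```

Iterating over zip(titles, prices) also avoids the manual index counter from the question, which is reset to 0 on every pass through the loop. None of this helps if the prices are injected by JavaScript after the page loads, because then they are not in the HTML that requests receives.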
Upvotes: 2