murad

Reputation: 5

Finding the rating score of a URL

I am trying to create a dataframe of reviews for 20 banks. In the code below I am trying to extract the 20 customer rating scores, but I am finding it difficult because I am new to BeautifulSoup and web scraping.

import pandas as pd
import requests
from bs4 import BeautifulSoup
url = 'https://www.bankbazaar.com/reviews.html'
page = requests.get(url)
print(page.text)
soup = BeautifulSoup(page.text,'html.parser')


Rating = []
rat_elem = soup.find_all('span')
for rate in rat_elem:
    Rating.append(rate.find_all('div').get('value'))

print(Rating)

Upvotes: 0

Views: 183

Answers (2)

Remy J

Reputation: 729

import pandas as pd
import requests
from bs4 import BeautifulSoup
url = 'https://www.bankbazaar.com/reviews.html'
page = requests.get(url)
print(page.text)
soup = BeautifulSoup(page.text,'html.parser')

# Find all the span elements where the "itemprop" attribute is "ratingvalue". 
Rating = [item.text for item in soup.find_all('span', attrs={"itemprop":"ratingvalue"})]


print(Rating)
# The output
# ['4.0', '5.0', '5.0', '5.0', '4.0', '4.0', '5.0', '5.0', '5.0', '5.0', '4.0', '5.0', '5.0', '5.0', '5.0', '4.0', '4.5', '4.0', '4.0', '4.0']
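
As a side note on why the loop in the question fails: `find_all()` returns a list-like `ResultSet`, not a single tag, so calling `.get()` on it raises an `AttributeError`. The sketch below shows this offline on a small inline snippet. The HTML is made up for illustration and is only assumed to resemble the real page's markup.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real page's HTML may differ.
html = (
    '<div><span itemprop="ratingvalue">4.0</span>'
    '<span itemprop="ratingvalue">4.5</span></div>'
)
soup = BeautifulSoup(html, "html.parser")

spans = soup.find_all("span", attrs={"itemprop": "ratingvalue"})

# find_all() returns a list-like ResultSet, so Tag methods such as .get()
# are not available on it and the call raises AttributeError.
try:
    spans.get("value")
    failed = False
except AttributeError:
    failed = True

# Read the text from each Tag individually instead.
ratings = [span.get_text(strip=True) for span in spans]
print(failed, ratings)
```

In this snippet the rating is the text inside the span rather than a `value` attribute, which is why the text is read from each element.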

See the BeautifulSoup documentation on keyword arguments.

Upvotes: 0

facelessuser

Reputation: 1734

I prefer CSS selectors. You should be able to select all the rating spans by matching those whose itemprop attribute is set to ratingvalue.

import pandas as pd
import requests
from bs4 import BeautifulSoup
url = 'https://www.bankbazaar.com/reviews.html'
page = requests.get(url)
print(page.text)
soup = BeautifulSoup(page.text,'html.parser')

Rating = []
for rate in soup.select('span[itemprop=ratingvalue]'):
    Rating.append(rate.get_text()) 

print(Rating)

Relevant output

['4.0', '5.0', '5.0', '5.0', '4.0', '4.0', '5.0', '5.0', '5.0', '5.0', '4.0', '5.0', '5.0', '5.0', '5.0', '4.0', '4.5', '4.0', '4.0', '4.0']  
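
Since the question mentions building a dataframe, one possible next step is to convert these strings to numbers and load them into pandas. This is only a minimal sketch: the short `ratings` list stands in for the scraped values, and the column name `rating` is an arbitrary choice, not something taken from the page.

```python
import pandas as pd

# Sample values in the same string format as the scraped output;
# in practice this would be the Rating list built above.
ratings = ['4.0', '5.0', '4.5']

# pd.to_numeric converts the strings to floats so they can be aggregated.
df = pd.DataFrame({"rating": pd.to_numeric(ratings)})

mean_rating = df["rating"].mean()
print(df)
print(mean_rating)
```

Other scraped fields, such as the bank name or review text, could be added as further columns, provided each list has the same length as the ratings.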

EDIT: add relevant output

Upvotes: 2
