How to scrape data from a website with same div class names with beautifulsoup?

Question

I am a beginner in Python and web scraping. I have been scraping data and images successfully for 3 months and just got my first freelance job. This time I am struggling because the data I am after sits in divs that share the same class name as other divs, and I can't figure out how to target them specifically.

The parsed HTML is shown below:

Country
    United States
Eye color
    blue
Hair color
    blonde
Height
    173.0 cm (5'8")
Weight
    58 kg (128 lbs)
BMI
    19.0 (normal)
Add to favorites
Is favorite

I am trying to get the country, height, weight, and hair color, but as you can see, all of them are in divs with the same class, div class="gr-6". The code below fetches the HTML, but how do I extract those specific values from it?

import requests
from bs4 import BeautifulSoup

url = 'https://egeniotik.com/en/funnystar/ann-ann'

response = requests.get(url)

soup = BeautifulSoup(response.text,'html.parser')
tags = soup.find_all("div", attrs={'class': 'stage-star-main-aside'})
tagsec= tags.find_all("li", attrs={'class': 'row is-copy is-bold'})

On the tagsec line I get the following error:

    "ResultSet object has no attribute '%s'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?" % key
AttributeError: ResultSet object has no attribute 'find_all'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?

Barmar · Accepted Answer

Loop through the rows. The attribute name is in the first DIV, the value is in the second DIV.

rows = soup.select(".stage-star-main-aside li.row")
for row in rows:
    divs = row.find_all("div", class_="gr-6")
    attr_name = divs[0].get_text().strip()
    attr_value = divs[1].get_text().strip()
    print(f"{attr_name} = {attr_value}")
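If only some of the fields are needed, one possible extension is to collect the label/value pairs into a dict and look up the wanted labels by name. The sketch below runs against a small inline sample rather than the live page: the class names come from the question, but the ul wrapper and exact nesting are assumptions, and extract_attributes is a hypothetical helper name. It also uses find() for the container, since find_all() returns a ResultSet (a list of tags), which is what triggered the AttributeError in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: the <ul> wrapper and exact nesting are assumed;
# only the class names are taken from the question.
SAMPLE_HTML = """
<div class="stage-star-main-aside">
  <ul>
    <li class="row is-copy is-bold">
      <div class="gr-6">Country</div>
      <div class="gr-6">United States</div>
    </li>
    <li class="row is-copy is-bold">
      <div class="gr-6">Eye color</div>
      <div class="gr-6">blue</div>
    </li>
    <li class="row is-copy is-bold">
      <div class="gr-6">Hair color</div>
      <div class="gr-6">blonde</div>
    </li>
    <li class="row is-copy is-bold">
      <div class="gr-6">Height</div>
      <div class="gr-6">173.0 cm (5'8")</div>
    </li>
    <li class="row is-copy is-bold">
      <div class="gr-6">Weight</div>
      <div class="gr-6">58 kg (128 lbs)</div>
    </li>
  </ul>
</div>
"""


def extract_attributes(html):
    """Return a {label: value} dict built from the li.row elements."""
    soup = BeautifulSoup(html, "html.parser")
    # find() returns a single Tag (or None). find_all() returns a ResultSet,
    # which has no find_all() method of its own.
    aside = soup.find("div", class_="stage-star-main-aside")
    if aside is None:
        return {}
    attributes = {}
    for row in aside.select("li.row"):
        divs = row.find_all("div", class_="gr-6")
        if len(divs) < 2:
            continue  # skip rows that lack a label/value pair
        label = divs[0].get_text(strip=True)
        value = divs[1].get_text(strip=True)
        attributes[label] = value
    return attributes


attributes = extract_attributes(SAMPLE_HTML)
wanted = ["Country", "Height", "Weight", "Hair color"]
# .get() yields None for a label that is missing from the page
selected = {name: attributes.get(name) for name in wanted}
for name, value in selected.items():
    print(f"{name} = {value}")
```

To run this against the real page, pass response.text from the question's requests.get(url) call instead of SAMPLE_HTML, assuming the live markup matches the structure sketched here.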
