Web Scraping does not give the desired results

Question

I am trying to scrape some data from a website, and the HTML looks as follows.


      Also Known As
    
          KOH Prep
          Fungal Smear, Culture, Antigen and Antibody Tests
          Mycology Tests
          Fungal Molecular Tests
          Potassium Hydroxide Preparation
          Calcofluor White Stain

The output I want to get is KOH Prep, Fungal Smear, Culture, Antigen and Antibody Tests, Mycology Tests, Fungal Molecular Tests...

But I don't get any output. My code is as follows.

def get_similar_names(sub_url):
    response = requests.get(sub_url)
    soup = BeautifulSoup(response.content, 'html.parser')
    if(soup.find('div', class_='field-label')!= None):
        other_names = [
            tag.next.next.get_text(strip=True, separator='|').split('|')
            for tag in soup.find('div', class_='field-label')
        ]
        return (other_names[0])
    else:
        return None

The actual link for the web page is this

HedgeHog · Accepted Answer

There are different approaches to get the names.

#1 - Get all names joined as a string, as in your expected output:

soup.select_one('div.field-items').get_text(',',strip=True)

Output -> KOH Prep,Fungal Smear, Culture, Antigen and Antibody Tests,Mycology Tests,Fungal Molecular Tests,Potassium Hydroxide Preparation,Calcofluor White Stain

#2 - Get all names as a list:

[name.get_text() for name in soup.select('div.field-items > div')]

Output -> ['KOH Prep','Fungal Smear, Culture, Antigen and Antibody Tests','Mycology Tests','Fungal Molecular Tests','Potassium Hydroxide Preparation','Calcofluor White Stain']

#3 - Get only the first name, as in your code:

soup.select_one('div.field-items > div').get_text()

Output -> KOH Prep
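
The three selectors above can be tried without fetching the page. The sketch below runs them against a small inline HTML string. That markup is an assumption: it is reconstructed from the class names used in the question and in the selectors (`field-label`, `field-items`), and the `field-item` class on the inner divs is hypothetical. The real page may differ.

```python
from bs4 import BeautifulSoup

# Assumed markup, reconstructed from the class names in the question;
# the live page's HTML may be structured differently.
html = """
<div class="field-label">Also Known As</div>
<div class="field-items">
  <div class="field-item">KOH Prep</div>
  <div class="field-item">Fungal Smear, Culture, Antigen and Antibody Tests</div>
  <div class="field-item">Mycology Tests</div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# 1 - all names joined into one comma-separated string
joined = soup.select_one("div.field-items").get_text(",", strip=True)

# 2 - all names as a list, one entry per direct child div
names = [div.get_text(strip=True) for div in soup.select("div.field-items > div")]

# 3 - only the first name
first = soup.select_one("div.field-items > div").get_text(strip=True)

print(joined)  # KOH Prep,Fungal Smear, Culture, Antigen and Antibody Tests,Mycology Tests
print(names)   # ['KOH Prep', 'Fungal Smear, Culture, Antigen and Antibody Tests', 'Mycology Tests']
print(first)   # KOH Prep
```

Note that one of the names contains commas itself, so the joined string from #1 cannot be split back into names reliably. The list from #2 is the safer choice if the names are processed further.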

Example

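import requests
from bs4 import BeautifulSoup

def get_similar_names(sub_url):
    response = requests.get(sub_url)
    soup = BeautifulSoup(response.content, 'html.parser')
    # Join the text of every name under div.field-items with commas
    other_names = soup.select_one('div.field-items').get_text(',', strip=True)

    return other_names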

Output

KOH Prep,Fungal Smear, Culture, Antigen and Antibody Tests,Mycology Tests,Fungal Molecular Tests,Potassium Hydroxide Preparation,Calcofluor White Stain
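
The code in the question returned None when the block was missing, while `select_one(...)` followed directly by `.get_text(...)` raises an AttributeError in that case, because `select_one` returns None when nothing matches. A minimal sketch that keeps the None behaviour is shown below. The function name `extract_similar_names` is hypothetical, and it takes the HTML as a string so it can be checked without a network request.

```python
from bs4 import BeautifulSoup

def extract_similar_names(html):
    """Return the names under div.field-items as a list, or None if absent.

    Hypothetical helper; assumes each name sits in a direct child div.
    """
    soup = BeautifulSoup(html, "html.parser")
    container = soup.select_one("div.field-items")
    if container is None:
        # The page has no "Also Known As" block
        return None
    # recursive=False limits the search to direct children of the container
    return [div.get_text(strip=True) for div in container.find_all("div", recursive=False)]
```

To use it with a live page, pass `requests.get(sub_url).content` as the `html` argument.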
