Jeril

Reputation: 8521

Python - Beautiful Soup OR condition in soup.find_all(....)

We were scraping the Amazon.in website to retrieve the price of a product. The 'span' tag that holds the price has a different value for its 'id' attribute depending on the product, such as:

id = 'priceblock_ourprice', id = 'priceblock_saleprice', and id = 'priceblock_dealprice'.

Our task is to retrieve the price of the products using the find_all(..) method in Beautiful Soup. With our basic knowledge, we could pass only one 'id' value to find_all(..), as shown below:

m = soup1.find_all('span', {'id': 'priceblock_ourprice'})

Is there any way to pass multiple values to the find_all(..) method so that they are combined with an OR condition?

The following links show pages with different values of the same 'id' attribute:

Link 1

Link 2

Link 3

Thank you for your help!

Upvotes: 5

Views: 6616

Answers (4)

amir Shabani

Reputation: 1

I found another way: you can pass a compiled regular expression as the attribute value.

import re
ids = ['priceblock_ourprice', 'priceblock_saleprice', 'priceblock_dealprice']
m = soup1.find_all('span', {'id': re.compile("|".join(ids))})
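
One caveat with the regex approach: Beautiful Soup searches the attribute value for the pattern, so an unanchored alternation also matches ids that merely contain one of the names. A minimal sketch of an anchored version follows; the sample HTML and the id 'priceblock_dealprice_extra' are made up for illustration and are not taken from Amazon's pages.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a product page (not real Amazon HTML).
html = """
<span id="priceblock_ourprice">100</span>
<span id="priceblock_saleprice">90</span>
<span id="priceblock_dealprice_extra">80</span>
<span id="other">70</span>
"""
soup1 = BeautifulSoup(html, "html.parser")

ids = ["priceblock_ourprice", "priceblock_saleprice", "priceblock_dealprice"]

# Anchor the alternation with ^...$ so only exact ids match;
# re.escape guards against regex metacharacters in the names.
pattern = re.compile(r"^(?:" + "|".join(map(re.escape, ids)) + r")$")

m = soup1.find_all("span", {"id": pattern})
matched_ids = [tag["id"] for tag in m]
print(matched_ids)
```

Without the anchors, the third span in this sample would be matched as well, because its id contains 'priceblock_dealprice'.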

Upvotes: 0

Human006

Reputation: 142

For those who want to avoid overcomplicating their script: simply passing a list of values in the find_all call works perfectly fine, like so:

find_all(name='div', attrs={'class': [
    ...
    'one_sixth grey_block new-secondary-background result-item',
    'one_sixth grey_block new-secondary-back',
    ...
]})
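
Applied to the ids from the question, the list form might look like the sketch below. The sample HTML is invented for illustration; only the three id values come from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a product page (not real Amazon HTML).
html = """
<span id="priceblock_ourprice">100</span>
<span id="priceblock_dealprice">80</span>
<span id="unrelated">ignore me</span>
"""
soup1 = BeautifulSoup(html, "html.parser")

# A list as the attribute value matches a tag whose id equals ANY item in it,
# which gives the OR behaviour asked about.
m = soup1.find_all(
    "span",
    {"id": ["priceblock_ourprice", "priceblock_saleprice", "priceblock_dealprice"]},
)
prices = [tag.get_text(strip=True) for tag in m]
print(prices)
```

Because each list item is compared against the whole id value, this avoids the partial matches an unanchored regex can produce.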

Upvotes: 4

Taha Hamedani

Reputation: 113

You can express your condition as a function passed to find_all, as follows:

td_tag_list = soup.find_all(
    lambda tag: tag.name == "span" and
    'id' in tag.attrs and tag.attrs['id'] == 'priceblock_ourprice')
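
The lambda above tests a single id. To get the OR condition from the question, the equality check can be swapped for a membership test, as in this sketch. The name price_ids and the sample HTML are assumptions made for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a product page (not real Amazon HTML).
html = """
<div id="priceblock_ourprice">not a span</div>
<span id="priceblock_dealprice">1,299.00</span>
<span>no id here</span>
"""
soup = BeautifulSoup(html, "html.parser")

# Illustrative name: the set of ids that should count as a price tag.
price_ids = {"priceblock_ourprice", "priceblock_saleprice", "priceblock_dealprice"}

# tag.get("id") returns None when the attribute is missing,
# and None is never in the set, so no separate 'id' in tag.attrs check is needed.
price_tags = soup.find_all(
    lambda tag: tag.name == "span" and tag.get("id") in price_ids
)
prices = [tag.get_text(strip=True) for tag in price_tags]
print(prices)
```

The div is skipped even though its id is in the set, since the lambda also requires the tag name to be "span".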

Upvotes: 4

SQLnoob

Reputation: 253

I haven't tested this, but I believe you can pass a function as an argument to find_all(), so you could try something like:

def check_id(tag):
    valid_ids = ['priceblock_ourprice','priceblock_saleprice','priceblock_dealprice']
    if tag.has_attr('id'):
        return tag['id'] in valid_ids
    else:
        return False

m = soup1.find_all(check_id)

Upvotes: 3
