314mip

Reputation: 403

Get specific items from bs4.element

I have an element of type bs4.element.Tag:

<div class="table_v_nr">
    1003 : 11400

    <span class="table_v_time" title="12. min. 2. hr. 6. day.">Y 35id</span></div>

I need to get "1003 : 11400" from this element. How can I do that?

Thank you

EDIT:

And how can I select the individual values ("1003 : 11400", ...) if I have more than one div:

    <div class="table_v_nr">
    1003 : 11400

    <span class="table_v_time" title="12. min. 2. hr. 6. day.">Y 35id</span></div>,
<div class="table_v_nr">
    1003 : 11400

    <span class="table_v_time" title="12. min. 2. hr. 6. day.">Y 36id</span></div>,
<div class="table_v_nr">
    1007 : 11550

    <span class="table_v_time" title="13. min. 2. hr. 6. day.">Y 37id</span></div>,

...

Upvotes: 2

Views: 719

Answers (2)

Sushil

Reputation: 5531

You can use find_next(text=True), which returns the first text node after the div's opening tag:

div = soup.find('div', class_ = "table_v_nr")
print(div.find_next(text=True).strip())

Full code:

from bs4 import BeautifulSoup

html = '''
<div class="table_v_nr">
    1003 : 11400

    <span class="table_v_time" title="12. min. 2. hr. 6. day.">Y 35id</span></div>
'''
soup = BeautifulSoup(html,'html5lib')

div = soup.find('div', class_ = "table_v_nr")
print(div.find_next(text=True).strip())

Output:

1003 : 11400

Edit:

If you want to extract the text from multiple div tags, you can loop over the result of find_all:

from bs4 import BeautifulSoup

html = """
    <div class="table_v_nr">
    1003 : 11400

    <span class="table_v_time" title="12. min. 2. hr. 6. day.">Y 35id</span></div>,
<div class="table_v_nr">
    1003 : 11400

    <span class="table_v_time" title="12. min. 2. hr. 6. day.">Y 36id</span></div>,
<div class="table_v_nr">
    1007 : 11550

    <span class="table_v_time" title="13. min. 2. hr. 6. day.">Y 37id</span></div>,
"""
soup = BeautifulSoup(html,'html5lib')

# Print the leading text node of every matching div
for div in soup.find_all('div', class_="table_v_nr"):
    print(div.find_next(text=True).strip())

Output:

1003 : 11400
1003 : 11400
1007 : 11550
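
If you need the values later instead of just printing them, the same lookup can collect them into a list. This is only a minimal sketch: it assumes the built-in html.parser, uses string=True (the newer spelling of text=True, available since bs4 4.4.0), and the names values and pairs plus the split into integers are illustrative additions, not part of the approach above.

```python
from bs4 import BeautifulSoup

html = """
<div class="table_v_nr">
    1003 : 11400

    <span class="table_v_time" title="12. min. 2. hr. 6. day.">Y 35id</span></div>,
<div class="table_v_nr">
    1003 : 11400

    <span class="table_v_time" title="12. min. 2. hr. 6. day.">Y 36id</span></div>,
<div class="table_v_nr">
    1007 : 11550

    <span class="table_v_time" title="13. min. 2. hr. 6. day.">Y 37id</span></div>,
"""
soup = BeautifulSoup(html, "html.parser")

# Collect the first text node after each div's opening tag
values = [
    div.find_next(string=True).strip()
    for div in soup.find_all("div", class_="table_v_nr")
]
print(values)

# Optional (assumption): split "a : b" into a pair of integers
pairs = [tuple(int(part) for part in value.split(":")) for value in values]
print(pairs)
```

The integer split only makes sense if every value really has the "number : number" shape shown in the question.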

Upvotes: 2

MendelG

Reputation: 20038

Use .contents, which lists the tag's children; here the text is the first child:

from bs4 import BeautifulSoup

html = """<div class="table_v_nr">
    1003 : 11400

   <span class="table_v_time" title="12. min. 2. hr. 6. day.">Y 35id</span></div>
"""
soup = BeautifulSoup(html,'html.parser')

# contents[0] is the text node that precedes the <span>
text = soup.find('div', class_="table_v_nr").contents[0]
print(text.strip())

Output:

1003 : 11400
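
Note that .contents[0] relies on the text being the very first child of the div. If that is not guaranteed, one possible variation is to ask for the first direct text node instead. This is a hedged sketch: first_text is a hypothetical helper name, and it assumes bs4 4.4.0 or newer for the string= argument.

```python
from bs4 import BeautifulSoup

html = """<div class="table_v_nr">
    1003 : 11400

    <span class="table_v_time" title="12. min. 2. hr. 6. day.">Y 35id</span></div>
"""
soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", class_="table_v_nr")


def first_text(tag):
    """Return the first direct text child of tag, stripped (hypothetical helper)."""
    # recursive=False limits the search to direct children, so the
    # text inside the nested <span> is never returned.
    node = tag.find(string=True, recursive=False)
    return node.strip() if node is not None else ""


print(first_text(div))
```

Because of recursive=False, a div that contains only a span yields an empty string rather than the span's text.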

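Edit: you can also use a CSS selector to pick only the divs whose text contains a given value (here html is the multi-div snippet from the question's edit):

from bs4 import BeautifulSoup

soup = BeautifulSoup(html, 'html.parser')

# :-soup-contains replaces the deprecated :contains (needs soupsieve 2.1+)
for tag in soup.select(".table_v_nr:-soup-contains('1003')"):
    print(tag.next_element.strip())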

Output:

1003 : 11400
1003 : 11400
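
Keep in mind that a "contains" selector matches substrings, so '1003' would also match a div whose text is, say, "11003 : 5". If you need an exact match on the first number, you could filter in Python instead. This is a rough sketch under assumptions: wanted and matches are illustrative names, and it assumes every value has the "a : b" shape from the question.

```python
from bs4 import BeautifulSoup

html = """
<div class="table_v_nr">
    1003 : 11400

    <span class="table_v_time" title="12. min. 2. hr. 6. day.">Y 35id</span></div>,
<div class="table_v_nr">
    1003 : 11400

    <span class="table_v_time" title="12. min. 2. hr. 6. day.">Y 36id</span></div>,
<div class="table_v_nr">
    1007 : 11550

    <span class="table_v_time" title="13. min. 2. hr. 6. day.">Y 37id</span></div>,
"""
soup = BeautifulSoup(html, "html.parser")

wanted = "1003"  # illustrative value to match exactly
matches = []
for div in soup.find_all("div", class_="table_v_nr"):
    # First direct text child of the div (skips the nested <span>)
    node = div.find(string=True, recursive=False)
    if node is None:
        continue
    value = node.strip()
    # Compare only the part before the colon, as an exact string
    if value.split(":")[0].strip() == wanted:
        matches.append(value)

print(matches)
```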

Upvotes: 1
