srijan srivastav

Reputation: 33

How to get the text inside a span tag which is inside another tag using beautifulsoup?

How do I get the value of all the tags that have class="no-wrap text-right circulating-supply"? What I used was:

text = soup.find_all(class_="no-wrap text-right circulating-supply")

Output of text[0].text:

'\n\n17,210,662\nBTC\n'

I just want to extract the numeric value.

Example of one instance:

<td class="no-wrap text-right circulating-supply" data-sort="17210662.0">
            <span data-supply="17210662.0">
             <span data-supply-container="">
              17,210,662
             </span>
             <span class="hidden-xs">
              BTC
             </span>
            </span>
           </td>

Thanks.

Upvotes: 3

Views: 80

Answers (2)

Andersson

Reputation: 52685

If all the elements share the same HTML structure, try the following to get the required output:

texts = [node.text.strip().split('\n')[0] for node in soup.find_all(class_="no-wrap text-right circulating-supply")]
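A minimal, self-contained sketch of the one-liner above, run against the sample cell from the question. The final conversion to `int` is an added assumption about what "numeric value" means here; it is not part of the original answer.

```python
from bs4 import BeautifulSoup

# Sample cell copied from the question.
html = """<td class="no-wrap text-right circulating-supply" data-sort="17210662.0">
            <span data-supply="17210662.0">
             <span data-supply-container="">
              17,210,662
             </span>
             <span class="hidden-xs">
              BTC
             </span>
            </span>
           </td>"""

soup = BeautifulSoup(html, "html.parser")
cells = soup.find_all(class_="no-wrap text-right circulating-supply")

# strip() drops the leading whitespace, so the first line is the number.
texts = [cell.text.strip().split("\n")[0] for cell in cells]

# Assumption: an integer is wanted, so remove the thousands separators.
supplies = [int(t.replace(",", "")) for t in texts]

print(texts)
print(supplies)
```

This relies on the number being the first non-blank line of every cell, which is why the answer is conditioned on all elements having the same structure.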

Upvotes: 2

Madhan Varadhodiyil

Reputation: 2116

This might look like overkill, but you could use a regex to extract the numbers:

import re

from bs4 import BeautifulSoup

html = """<td class="no-wrap text-right circulating-supply" data-sort="17210662.0">
            <span data-supply="17210662.0">
            <span data-supply-container="">
            17,210,662
            </span>
            <span class="hidden-xs">
            BTC
            </span>
            </span>
        </td>"""

soup = BeautifulSoup(html, 'html.parser')
# Remove the thousands separators, then collect every run of digits in each cell's text.
coin_value = [re.findall(r'\d+', node.text.replace(',', '')) for node in soup.find_all(class_="no-wrap text-right circulating-supply")]
print(coin_value)

prints

[['17210662']]
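Since the question asks specifically about the span nested inside the cell, another option might be to target that span directly instead of post-processing the whole cell text. The sketch below is not from either answer; it assumes every cell contains a `span` with a `data-supply-container` attribute and carries a `data-sort` attribute, as in the sample markup.

```python
from bs4 import BeautifulSoup

# Sample cell copied from the question.
html = """<td class="no-wrap text-right circulating-supply" data-sort="17210662.0">
            <span data-supply="17210662.0">
             <span data-supply-container="">
              17,210,662
             </span>
             <span class="hidden-xs">
              BTC
             </span>
            </span>
           </td>"""

soup = BeautifulSoup(html, "html.parser")

# CSS selector: the inner span that holds only the formatted number.
spans = soup.select("td.circulating-supply span[data-supply-container]")
values = [span.get_text(strip=True) for span in spans]

# Assumption: data-sort holds the same quantity in unformatted form.
cells = soup.select("td.circulating-supply")
sort_keys = [float(cell["data-sort"]) for cell in cells]

print(values)
print(sort_keys)
```

Reading the attribute avoids parsing the comma-formatted text at all, but it only works if the page keeps `data-sort` in sync with the displayed value.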

Upvotes: 1
