kibaya

Reputation: 93

How to get inner text value of an HTML tag with BeautifulSoup bs4?

When using BeautifulSoup (bs4), how do I get the text inside an HTML tag? When I run this line:

oname = soup.find("title")

I get the title tag like this:

<title>page name</title>

Now I want only its inner text, page name, without the tags. How do I do that?

Upvotes: 9

Views: 18218

Answers (1)

Padraic Cunningham

Reputation: 180532

Use .text to get the text from the tag.

oname = soup.find("title")
oname.text

Or just soup.title.text
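As a minimal sketch of the two forms above, the snippet below parses a small inline document instead of a live page. The HTML string, the "html.parser" choice, and the names title_text, same_text, missing and safe_text are assumptions made for illustration, not part of the original question. It also shows that find() returns None when the tag is absent, so .text should only be read after checking for that.

```python
from bs4 import BeautifulSoup

# Hypothetical sample document standing in for a real page
html = "<html><head><title>page name</title></head><body><p>Hello</p></body></html>"
soup = BeautifulSoup(html, "html.parser")

# find() returns the whole tag; .text gives only the text inside it
oname = soup.find("title")
title_text = oname.text

# Attribute-style access plus get_text() yields the same string
same_text = soup.title.get_text()

# find() returns None for a missing tag, so guard before reading .text
missing = soup.find("h1")
safe_text = missing.text if missing is not None else ""

print(title_text)
```

Here title_text and same_text hold the same string, and safe_text falls back to an empty string because the sample document has no h1 tag.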

In [4]: from bs4 import BeautifulSoup
In [5]: import requests
In [6]: r = requests.get("http://stackoverflow.com/questions/27934387/how-to-retrieve-information-inside-a-tag-with-python/27934403#27934387")
In [7]: BeautifulSoup(r.content, "html.parser").title.text
Out[7]: u'html - How to Retrieve information inside a tag with python - Stack Overflow'

To open a file and use the text as its name, simply use it as you would any other string:

with open(oname.text, 'w') as f:
    pass  # write to f here
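A slightly fuller sketch of the same idea follows. A page title can contain characters that are not valid in file names (a slash, for example), so this version replaces them first. The sample HTML, the replacement pattern, the ".txt" suffix and the temporary output directory are all assumptions chosen for illustration, not something the original answer prescribes.

```python
import os
import re
import tempfile

from bs4 import BeautifulSoup

# Hypothetical sample document standing in for a real page
html = "<html><head><title>page name</title></head><body><p>Hello</p></body></html>"
soup = BeautifulSoup(html, "html.parser")
oname = soup.find("title")

# Assumption: swap characters that are commonly invalid in file names for "_"
filename = re.sub(r'[\\/:*?"<>|]', "_", oname.text.strip()) + ".txt"

# Write into a temporary directory so the sketch leaves the working directory alone
out_dir = tempfile.mkdtemp()
path = os.path.join(out_dir, filename)

with open(path, "w", encoding="utf-8") as f:
    # Store the page's visible text in the file named after the title
    f.write(soup.get_text())

print(path)
```

If the title is already a safe file name, as in the question, the substitution leaves it unchanged and the result matches the one-line form above.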

Upvotes: 13
