pcymichael

Reputation: 95

How to get the text inside a <script> tag

A while ago I used the following code to read window._sharedData, but the same code no longer returns anything. What should I do?

If I change script to div it works, but I need to use script.

code.py

from bs4 import BeautifulSoup
html1 = '<h1><script>window._sharedData;</script></h1>'
soup = BeautifulSoup(html1)
print(soup.find('script').text)

Upvotes: 2

Views: 77

Answers (2)

Humayun Ahmad Rajib

Reputation: 1560

Use BeautifulSoup(html1, 'lxml') instead of BeautifulSoup(html1) so that the parser is specified explicitly. If the output is still empty, use .string instead of .text. Try this:

from bs4 import BeautifulSoup
html1 = '<h1><script>window._sharedData;</script></h1>'
soup = BeautifulSoup(html1, 'lxml')
print(soup.find('script').text)

or

print(soup.find('script').string)

Output will be:

window._sharedData;

Upvotes: 0

Viewed

Reputation: 1413

Pass a parser explicitly (html.parser or lxml) and use .string instead of .text:

from bs4 import BeautifulSoup
html = '<h1><script>window._sharedData;</script></h1>'
soup = BeautifulSoup(html, 'html.parser')
print(soup.find('script').string)
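As for why .string is the safer choice: my understanding is that some Beautiful Soup releases (starting around 4.9) stopped counting the contents of <script> as "text" for .text / get_text() when html.parser or lxml is used, whereas .string simply returns the tag's single child string. A minimal sketch of a helper built on that idea follows; the name script_source and its fallback branch are my own additions for illustration, not part of the bs4 API.

```python
from bs4 import BeautifulSoup


def script_source(html, parser="html.parser"):
    """Return the contents of the first <script> tag, or None if there is none.

    script_source is a hypothetical helper name, not a bs4 function.
    """
    # Naming the parser explicitly keeps the result the same on every machine.
    soup = BeautifulSoup(html, parser)
    tag = soup.find("script")
    if tag is None:
        return None
    # .string is the tag's only child string, so it does not depend on
    # whether .text / get_text() treats script contents as text.
    if tag.string is not None:
        return str(tag.string)
    # Fallback for an empty tag or one with several children:
    # join whatever children the parser produced.
    return "".join(str(child) for child in tag.contents)


html = '<h1><script>window._sharedData;</script></h1>'
print(script_source(html))  # window._sharedData;
```

If there is no <script> tag at all, the helper returns None instead of raising AttributeError, which is what soup.find('script').string would do.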

Upvotes: 1
