William

Reputation: 4036

How can I get only part of the HTML source with Python Selenium?

When I use driver.page_source I get the source of the full page. Is there any way to get only a specific part of the HTML?

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

chrome_options = webdriver.ChromeOptions()

# Start Chrome using the local chromedriver binary
driver = webdriver.Chrome(executable_path="/selenium/chromedriver", options=chrome_options)
driver.get("https://news.creaders.net/us/2021/01/27/2315313.html")

# page_source holds the HTML of the entire page
content = driver.page_source

This gives me the HTML of the whole page.

But I only need the HTML inside <div id="newsContent"> </div>:

<div id="newsContent">

<p></p><p>content</p><p style="text-align: center;"><img src="https://pub.creaders.net/upload_files/image/202101/20210127_16117914118079.png" title="20210127_16117914118079.png" alt="image.png"></p>

</div>

Upvotes: 0

Views: 208

Answers (1)

ziptron

Reputation: 229

Try running your HTML output (the string from driver.page_source) through the BeautifulSoup parser and picking out the div by its id.

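from bs4 import BeautifulSoup

html = driver.page_source  # full page source from the driver in the question
soup = BeautifulSoup(html, 'html.parser')
div = soup.find('div', id='newsContent')
# Join the div's children to get only the HTML inside it
print(''.join(map(str, div.contents)))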
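
If you would rather not add another dependency, Selenium itself can probably hand you the fragment: locate the element by id and read its innerHTML attribute. Below is a minimal sketch of that idea. The helper name get_inner_html is made up for illustration, and it assumes driver is an already started WebDriver like the one in the question.

```python
def get_inner_html(driver, element_id):
    """Return the HTML inside the element with the given id.

    get_inner_html is a hypothetical helper name; `driver` is assumed to be
    an already started Selenium WebDriver.
    """
    # "id" is the locator strategy string that selenium's By.ID constant holds
    element = driver.find_element("id", element_id)
    # innerHTML is the markup between the element's opening and closing tags
    return element.get_attribute("innerHTML")


# Usage with the driver from the question (not executed here):
# content = get_inner_html(driver, "newsContent")
```

Asking for "outerHTML" instead of "innerHTML" should include the div tag itself. If the content is filled in by JavaScript after the page loads, you may need to wait for the element before reading it.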

Upvotes: 1
