sshussain270

Reputation: 1865

How to set the page_source of selenium yourself in python3?

I am writing an application using Selenium. I know that you can use the get method of webdriver.Firefox to load a webpage and then read its source, like this:

    driver = webdriver.Firefox(executable_path=r'geckodriver')
    driver.get('file://' + os.path.dirname(os.path.abspath(__file__)) + '/index.html')
    driver.page_source # get the source

But instead of opening a webpage and getting the source from there, I want to provide the source myself, like this:

    driver.page_source = '<body><h1>Hello</h1></body>'

And then be able to perform the normal selenium operations, for example:

    driver.find_element_by_tag_name('h1')

But since Firefox.page_source is a @property, I can't set it manually. Does anyone know a workaround for that? Any suggestions will be highly appreciated.

Upvotes: 0

Views: 228

Answers (2)

ewwink

Reputation: 19154

You can load the HTML through a data URL, that is, a URL prefixed with the data: scheme:

    htmlString = '<body><h1>Hello</h1></body>'
    driver.get("data:text/html;charset=utf-8," + htmlString)
    h1 = driver.find_element_by_tag_name('h1')
    print(h1.text)

Length limitation: 65535 characters in some browsers; the exact limit on data URLs depends on the browser.
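Concatenating raw HTML onto the data: prefix works for simple markup, but characters such as # or % in the HTML can be misread as URL syntax. Below is a minimal sketch of percent-encoding the markup first, using only the standard library. The helper name html_to_data_url is hypothetical and not part of Selenium.

```python
from urllib.parse import quote, unquote


def html_to_data_url(html: str) -> str:
    """Return a data: URL carrying the given HTML, percent-encoded as UTF-8."""
    # quote() escapes characters such as '#', '%' and spaces that would
    # otherwise be treated as URL syntax; safe="" escapes '/' as well.
    return "data:text/html;charset=utf-8," + quote(html, safe="")


url = html_to_data_url('<body><h1 id="a#b">100% Hello</h1></body>')
print(url)

# Round trip: decoding the payload gives back the original markup.
payload = url.split(",", 1)[1]
print(unquote(payload))
```

With a live session, the resulting string would be passed to driver.get(url) in place of the plain concatenation.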

Alternatively, to avoid the length limitation, load a minimal page and inject the HTML with JavaScript through the execute_script() method:

    htmlString = '<html><body></body></html>'
    driver.get("data:text/html;charset=utf-8," + htmlString)
    largeHTMLString = '<h1>Hello</h1>'
    driver.execute_script('document.body.innerHTML=arguments[0]', largeHTMLString)
    h1 = driver.find_element_by_tag_name('h1')
    print(h1.text)
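The injection steps can be wrapped in a small helper so they are reusable. This is only a sketch: the name set_page_html is hypothetical, and it loads about:blank rather than a data URL on the assumption that the browser gives that page an empty body to fill.

```python
def set_page_html(driver, html):
    """Load a blank page in `driver`, then replace its body with `html`."""
    # about:blank gives the browser a real (empty) document to write into.
    driver.get("about:blank")
    # arguments[0] is the first extra argument given to execute_script, so
    # the HTML is passed as data and never spliced into the script text.
    driver.execute_script("document.body.innerHTML = arguments[0];", html)


# Hypothetical usage with a live session (requires Firefox and geckodriver):
#   from selenium import webdriver
#   driver = webdriver.Firefox()
#   set_page_html(driver, '<h1>Hello</h1>')
#   print(driver.find_element_by_tag_name('h1').text)
# Newer Selenium releases spell the lookup as
#   driver.find_element(By.TAG_NAME, 'h1')
```

Because the markup goes in as a script argument, quotes or newlines inside it need no escaping. Note that this replaces only the body, so head-level content in the string is not preserved as such.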

Upvotes: 4

Daniel Scott

Reputation: 985

If you don't mind parsing the HTML with Beautiful Soup instead of Selenium, this is how I would handle the problem:

    from bs4 import BeautifulSoup

    # Define the HTML source
    page_source = '<body><h1>Hello</h1></body>'

    # Parse it using Beautiful Soup (the 'lxml' parser must be installed)
    soup = BeautifulSoup(page_source, 'lxml')

    # Search for elements by tag name
    headings = soup.find_all('h1')

Hope that helps.

Upvotes: 0
