Martin

Reputation: 129

BeautifulSoup: convert a string back to a ResultSet object of the bs4.element module

Is there a way to save my ResultSet object of BeautifulSoup to a file, then read the file and be able to use commands such as find_all?

For example, my code is

import requests
from bs4 import BeautifulSoup

#scraping
website_link = 'https://stackoverflow.com/'
request1 = requests.get(website_link)
source1 = request1.content
soup1 = BeautifulSoup(source1, 'lxml')


#saving
savefilename = 'question.txt'
with open(savefilename, "w", encoding="utf-8") as f:
    # The with block closes the file on exit, so no explicit f.close() is needed
    f.write(str(soup1))

In the step f.write(str(soup1)), I convert the parsed object into a string so that it can be saved. This step is crucial, and I have not found a way around it. Once the object has been converted to a string, is there a way to convert it back into a BeautifulSoup object so that I can use .find_all() and similar methods again?

Upvotes: 2

Views: 667

Answers (1)

MendelG

Reputation: 20038

Just create another BeautifulSoup object from the saved text:

import requests
from bs4 import BeautifulSoup

#scraping
website_link = 'https://stackoverflow.com/'
request1 = requests.get(website_link)
source1 = request1.content
soup1 = BeautifulSoup(source1, 'html.parser')


#saving
savefilename = 'question.txt'
with open(savefilename, "w", encoding="utf-8") as f:
    f.write(str(soup1))

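# Open the saved file
with open(savefilename, "r", encoding="utf-8") as f:
    # f.read() returns the whole file as one string; str(f.readlines()) would
    # instead parse the repr of a list, including brackets and escaped newlines
    soup2 = BeautifulSoup(f.read(), "html.parser")

>>> print(type(soup2))
<class 'bs4.BeautifulSoup'>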
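
To check the round trip without hitting the network, here is a minimal sketch that uses a small made-up HTML string in place of the downloaded page. The sample markup and the temporary file location are assumptions for illustration only; the save and reload steps are the same idea as above. It also shows that find_all() on the reloaded soup returns a ResultSet again:

```python
import os
import tempfile

from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the scraped page
html = "<html><body><a href='/a'>first</a><a href='/b'>second</a></body></html>"
soup1 = BeautifulSoup(html, "html.parser")

# Save the parsed document as text (temporary path chosen for this sketch)
path = os.path.join(tempfile.mkdtemp(), "question.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write(str(soup1))

# Reload the text and parse it again to get a fresh BeautifulSoup object
with open(path, "r", encoding="utf-8") as f:
    soup2 = BeautifulSoup(f.read(), "html.parser")

# find_all() works on the reloaded object and returns a ResultSet
links = soup2.find_all("a")
hrefs = [a["href"] for a in links]
texts = [a.get_text() for a in links]
print(type(links).__name__, hrefs, texts)
```

Note that what gets stored is the HTML text, not the Python object, so anything you want to query later has to be recoverable by parsing that text again.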

Upvotes: 2
