Chai

Reputation: 11

Store scraped data to a text file in Python

I am able to scrape data using BeautifulSoup and am now looking to generate a file containing all the data that I scraped.

file = open("copy.txt", "w")
data = soup.get_text()  # get_text() returns only the text content, with all tags stripped
file.write(data)
file.close()

I don't see all the tags and the entire content in the text file. Any idea on how to achieve this?

Upvotes: 1

Views: 4351

Answers (2)

kederrac

Reputation: 17322

You can use:

with open("copy.txt", "w") as file:
    file.write(str(soup))

If you have a list of URLs to scrape and you want to store each scraped page in a separate file, you can try:

my_urls = [url_1, url_2, ..., url_n]
for index, url in enumerate(my_urls):
    # .............
    # some code to scrape 
    with open(f"scraped_{index}.txt", "w") as file:
        file.write(str(soup))
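
A fuller, runnable sketch of that loop, assuming each page is fetched with requests and parsed with BeautifulSoup; the URLs below are placeholders (the webscraper.io test pages), not part of the original answer:

import requests
from bs4 import BeautifulSoup

# placeholder URLs -- substitute the pages you actually want to scrape
my_urls = [
    "https://webscraper.io/test-sites/e-commerce/allinone",
    "https://webscraper.io/test-sites/e-commerce/allinone/phones",
]

for index, url in enumerate(my_urls):
    r = requests.get(url)                           # fetch the page
    soup = BeautifulSoup(r.content, "html.parser")  # parse the full HTML
    with open(f"scraped_{index}.txt", "w", encoding="utf-8") as file:
        file.write(str(soup))                       # write the markup, tags included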

Upvotes: 1

soyapencil

Reputation: 614

Quick solution:

You just need to convert the soup to a string. Using a test site, in case others wish to follow:

from bs4 import BeautifulSoup as BS
import requests

r = requests.get("https://webscraper.io/test-sites/e-commerce/allinone")
soup = BS(r.content)

file = open("copy.txt", "w") 
file.write(str(soup))
file.close()

Slightly better solution:

It's better practice to use a context manager for your file I/O (use with):

from bs4 import BeautifulSoup as BS
import requests

r = requests.get("https://webscraper.io/test-sites/e-commerce/allinone")
soup = BS(r.content)

with open("copy.txt", "w") as file:
    file.write(str(soup))
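
As a side note, on some systems the default text encoding can raise a UnicodeEncodeError when the page contains non-ASCII characters. Here is a small variation of the same snippet that names the parser and pins the output encoding (both choices are assumptions, not part of the original answer):

from bs4 import BeautifulSoup as BS
import requests

r = requests.get("https://webscraper.io/test-sites/e-commerce/allinone")
soup = BS(r.content, "html.parser")  # name the parser explicitly instead of letting bs4 guess

# write with an explicit encoding so non-ASCII characters are stored safely
with open("copy.txt", "w", encoding="utf-8") as file:
    file.write(str(soup))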

Upvotes: 0
