max10

Reputation: 41

How do I get my Beautiful Soup output data into a text file?

How do I write the output of my Beautiful Soup code to a text file?
Here's the code:

import urllib2
from bs4 import BeautifulSoup

url = urllib2.urlopen("http://link").read()
soup = BeautifulSoup(url)
file = open("parseddata.txt", "wb")
for line in soup.find_all('a', attrs={'class': 'book-title-link'}):
    print (line.get('href'))
    file.write(line.get('href'))
    file.flush()
    file.close()

Upvotes: 3

Views: 6286

Answers (2)

falsetru

Reputation: 369124

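file.close() should be called only once, after the for loop. In the question's code it is called inside the loop, so the file is closed after the first link is written and the next write fails. Each link also needs a trailing newline, otherwise all the links end up on one line: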

import urllib2
from bs4 import BeautifulSoup

url = urllib2.urlopen("http://link").read()
soup = BeautifulSoup(url)
file = open("parseddata.txt", "wb")
for line in soup.find_all('a', attrs={'class': 'book-title-link'}):
    href = line.get('href')
    print href
    if href:
        file.write(href + '\n')
file.close()
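
To see the effect of closing inside the loop without any scraping involved, here is a minimal Python 3 sketch. The list of links and the temporary file path are made up for illustration; only the placement of close() matters.

```python
import os
import tempfile

# Hypothetical hrefs standing in for the scraped links.
links = ["/book/1", "/book/2", "/book/3"]
path = os.path.join(tempfile.mkdtemp(), "parseddata.txt")

# Buggy pattern: close() is called inside the loop.
f = open(path, "w")
error = None
try:
    for href in links:
        f.write(href + "\n")
        f.close()  # the file is closed after the first link
except ValueError as exc:
    # The second write hits a closed file and raises ValueError.
    error = exc

with open(path) as f:
    buggy_result = f.read().splitlines()

# Fixed pattern: the with block closes the file once, after the loop.
with open(path, "w") as f:
    for href in links:
        f.write(href + "\n")

with open(path) as f:
    fixed_result = f.read().splitlines()
```

With the buggy pattern only the first link reaches the file; with the fixed pattern all three do.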

UPDATE: You can pass href=True to find_all to avoid the if statement; it matches only tags that actually have an href attribute. In addition, if you open the file in a with statement, you don't need to close the file object manually:

import urllib2
from bs4 import BeautifulSoup


content = urllib2.urlopen("http://link").read()
soup = BeautifulSoup(content)

with open('parseddata.txt', 'wb') as f:
    for a in soup.find_all('a', attrs={'class': 'book-title-link'}, href=True):
        print a['href']
        f.write(a['href'] + '\n')
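
The code above targets Python 2 (urllib2, print statement). A rough Python 3 equivalent might look like the sketch below. It assumes the bs4 package is installed, uses an inline sample HTML string instead of a network request, and the helper names extract_hrefs and save_hrefs are hypothetical. To fetch a real page you would read it with urllib.request.urlopen(url).read() first.

```python
import os
import tempfile

from bs4 import BeautifulSoup


def extract_hrefs(html, css_class="book-title-link"):
    """Return the href of every <a> tag with the given class."""
    # Naming the parser explicitly avoids bs4's "no parser specified" warning.
    soup = BeautifulSoup(html, "html.parser")
    # href=True skips anchors that have no href attribute.
    anchors = soup.find_all("a", attrs={"class": css_class}, href=True)
    return [a["href"] for a in anchors]


def save_hrefs(hrefs, path):
    """Write one href per line; the with block closes the file."""
    # Text mode with an explicit encoding, since hrefs are str in Python 3.
    with open(path, "w", encoding="utf-8") as f:
        for href in hrefs:
            f.write(href + "\n")


# Sample markup, made up for illustration.
sample_html = (
    '<a class="book-title-link" href="/book/1">One</a>'
    '<a class="book-title-link">No link</a>'
    '<a class="other" href="/x">Other</a>'
    '<a class="book-title-link" href="/book/2">Two</a>'
)

hrefs = extract_hrefs(sample_html)
out_path = os.path.join(tempfile.mkdtemp(), "parseddata.txt")
save_hrefs(hrefs, out_path)
```

Splitting extraction from writing keeps the file handling independent of how the page is fetched or parsed.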

Upvotes: 3

zhufree

Reputation: 21

I just do this:

with open('./output/' + filename + '.html', 'w+') as f:
    f.write(temp.prettify("utf-8"))

Here temp is the HTML parsed by Beautiful Soup, and filename is the name to save it under.
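
Note that prettify("utf-8") returns encoded bytes, which a text-mode file accepts in Python 2 but not in Python 3. A possible Python 3 variant, assuming bs4 is installed, calls prettify() without an encoding (which returns a str) and sets the encoding on the file instead. The helper name save_pretty_html, the sample markup, and the temporary output directory are assumptions for illustration.

```python
import os
import tempfile

from bs4 import BeautifulSoup


def save_pretty_html(soup, directory, filename):
    """Write the prettified document to <directory>/<filename>.html."""
    # Create the output directory if it does not exist yet.
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, filename + ".html")
    # prettify() with no encoding returns str, so open in text mode.
    with open(path, "w", encoding="utf-8") as f:
        f.write(soup.prettify())
    return path


# Sample markup standing in for a parsed page.
temp = BeautifulSoup("<p>Hello</p>", "html.parser")
out_dir = os.path.join(tempfile.mkdtemp(), "output")
out_path = save_pretty_html(temp, out_dir, "page")
```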

Upvotes: 0
