Reputation: 41
How do I get my Beautiful Soup output data into a text file?
Here's the code:
import urllib2
from bs4 import BeautifulSoup

url = urllib2.urlopen("http://link").read()
soup = BeautifulSoup(url)
file = open("parseddata.txt", "wb")
for line in soup.find_all('a', attrs={'class': 'book-title-link'}):
    print (line.get('href'))
    file.write(line.get('href'))
    file.flush()
    file.close()
Upvotes: 3
Views: 6286
Reputation: 369124
file.close() should be called once (after the for loop):
import urllib2
from bs4 import BeautifulSoup

url = urllib2.urlopen("http://link").read()
soup = BeautifulSoup(url)
file = open("parseddata.txt", "wb")
for line in soup.find_all('a', attrs={'class': 'book-title-link'}):
    href = line.get('href')
    print href
    if href:
        file.write(href + '\n')
file.close()
UPDATE: You can pass href=True to find_all to avoid the if statement. In addition, if you use a with statement, you don't need to close the file object manually:
import urllib2
from bs4 import BeautifulSoup

content = urllib2.urlopen("http://link").read()
soup = BeautifulSoup(content)
with open('parseddata.txt', 'wb') as f:
    for a in soup.find_all('a', attrs={'class': 'book-title-link'}, href=True):
        print a['href']
        f.write(a['href'] + '\n')
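The snippets above are Python 2. On Python 3, writing a str to a file opened with 'wb' raises TypeError, so the file-writing part would probably look more like the sketch below. It only covers the writing step. The helper name write_hrefs and the sample values are made up for illustration and are not part of the original code.

```python
import os
import tempfile


def write_hrefs(hrefs, path):
    """Write each non-empty href on its own line; return how many were written."""
    count = 0
    # Text mode with an explicit encoding, because the hrefs are str objects.
    with open(path, "w", encoding="utf-8") as f:
        for href in hrefs:
            if href:  # skip None or empty values (anchors without an href)
                f.write(href + "\n")
                count += 1
    # Leaving the with block has already closed the file; no close() call needed.
    return count


if __name__ == "__main__":
    # Hypothetical sample values standing in for the scraped links.
    sample = ["/books/1", None, "/books/2"]
    out_path = os.path.join(tempfile.mkdtemp(), "parseddata.txt")
    print(write_hrefs(sample, out_path))
```

With Beautiful Soup, the hrefs argument could be a generator such as (a['href'] for a in soup.find_all('a', attrs={'class': 'book-title-link'}, href=True)), which keeps the scraping and the file handling separate.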
Upvotes: 3
Reputation: 21
I just do this:
with open('./output/' + filename + '.html', 'w+') as f:
    f.write(temp.prettify("utf-8"))
Here temp is the HTML that has been parsed by Beautiful Soup.
Upvotes: 0