Reputation: 15
I am trying to scrape several pages and write all of the results to a file, but for some reason only the results from the last page get written. My code is below:
from urllib.request import urlopen as oPen
from bs4 import BeautifulSoup as soup
import requests
for i in range(1,3):
    myurl='http://www.imdb.com/search/title?genres=sci_fi&title_type=feature&sort=moviemeter,asc&page=' + str(i) + '&ref_=adv_nxt'
    r = requests.get(myurl)
    page_soup = soup(r.content,"html.parser")
    uClient = oPen(myurl)
    page_html = uClient.read()
    uClient.close()
    containers=page_soup.findAll("div",{"class":"lister-item mode-advanced"})
    filename = "test.csv"
    f = open(filename,"w")
    headers="numbers\n"
    f.write(headers)
    for container in containers:
        nr=container.findAll("span",{"class":"lister-item-index unbold text-primary"})
        number=nr[0].text
        x=(number + "," '\n')
        f.write(x)
    f.close()
Thanks in advance!
Upvotes: 0
Views: 139
Reputation: 510
You should open the file with the "a" argument to append to it.
Each time you open it with "w", the existing file is overwritten. Since you reopen it on every pass through the loop, only the last page you write is left in the file at the end.
f = open(filename,"a")
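To see the difference between the two modes in isolation, here is a small sketch that needs no network access. The file name demo.txt and the "page N" strings are made up for illustration; they are not from the question.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Mode "w": every open() truncates the file, so only the last write survives.
for page in (1, 2):
    with open(path, "w") as f:
        f.write(f"page {page}\n")
with open(path) as f:
    after_w = f.read()

os.remove(path)  # start again from a missing file for the second run

# Mode "a": every open() keeps the existing content and adds to the end.
for page in (1, 2):
    with open(path, "a") as f:
        f.write(f"page {page}\n")
with open(path) as f:
    after_a = f.read()

print(repr(after_w))
print(repr(after_a))
```

With "w" the file ends up holding only the second page, while with "a" it holds both.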
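Better still, open the file once before the loop and close it after the loop. That way you don't reopen and close it for every page, and if you also move the header write before the loop, the header is written only once instead of once per page.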
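Opening the file once around the loop could look roughly like the sketch below. To keep it runnable without hitting IMDb, the scraping step is replaced by a hard-coded list of pages; write_pages and the sample values are hypothetical names and data, and I've assumed the csv module is acceptable in place of the manual string concatenation.

```python
import csv
import os
import tempfile


def write_pages(filename, pages):
    """Write every item of every page to one CSV file that is opened once."""
    # Opened before the loops, so "w" truncates the file a single time.
    with open(filename, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["numbers"])  # header written once, not per page
        for page in pages:
            for number in page:
                writer.writerow([number])
    # The with block closes the file here, after all pages are written.


# Stand-in for the scraped results of two pages (made-up values).
pages = [["1.", "2."], ["51.", "52."]]
path = os.path.join(tempfile.mkdtemp(), "test.csv")
write_pages(path, pages)

with open(path) as f:
    rows = f.read().splitlines()
print(rows)
```

In the original script, the inner loop over containers would take the place of the inner loop over page, with the requests and BeautifulSoup calls staying inside the outer loop.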
Upvotes: 1