Area21

Reputation: 15

Python scraping retrieving only last page - newbie

I am trying to scrape and print all pages stored in a variable, but for some reason only the last page's results get written. Below is my code:

from urllib.request import urlopen as oPen
from bs4 import BeautifulSoup as soup
import requests

for i in range(1,3):
    myurl='http://www.imdb.com/search/title?genres=sci_fi&title_type=feature&sort=moviemeter,asc&page=' + str(i) + '&ref_=adv_nxt'
    r = requests.get(myurl)
    page_soup = soup(r.content,"html.parser")
    uClient = oPen(myurl)
    page_html = uClient.read()
    uClient.close()

    containers=page_soup.findAll("div",{"class":"lister-item mode-advanced"})

    filename = "test.csv"
    f = open(filename,"w")
    headers="numbers\n"
    f.write(headers)

    for container in containers:
        nr=container.findAll("span",{"class":"lister-item-index unbold text-primary"})
        number=nr[0].text

        x=(number + "," '\n')
        f.write(x)
    f.close()

Thanks in advance!

Upvotes: 0

Views: 139

Answers (1)

radar

Reputation: 510

You should open the file with the "a" mode argument so that new data is appended to it. Each time you open a file with "w", its existing contents are overwritten. Since your code reopens the file on every pass through the loop, only the last page you write is left at the end.

f = open(filename,"a")
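To see the difference between the two modes in isolation, here is a minimal sketch that involves no scraping at all. The file name demo.csv and the "page N" strings are placeholders made up for this illustration; they are not from the question.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.csv")

# Reopening with "w" on every pass truncates the file each time,
# so only the text from the final pass survives.
for page in (1, 2):
    with open(path, "w") as f:
        f.write(f"page {page}\n")
with open(path) as f:
    after_w = f.read()

os.remove(path)  # start again from an empty state

# Reopening with "a" keeps what is already there and adds to the end.
for page in (1, 2):
    with open(path, "a") as f:
        f.write(f"page {page}\n")
with open(path) as f:
    after_a = f.read()

print(repr(after_w))
print(repr(after_a))
```

Keep in mind that "a" also keeps whatever an earlier run of the script left in the file, so the output grows on every run unless you delete the file first.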

You should also open the file once before the loop and close it once after the loop. That way you don't spend time opening and closing it for every page, and the header line only has to be written once.
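A rough sketch of the open-once layout follows. To keep it runnable without a network connection, the scraped values are replaced by a hard-coded list, fake_pages, and the helper name write_all_pages is made up for this example. In your script, each inner list would be the numbers pulled out of the containers for one page. Because the file is opened only a single time here, "w" is safe again and also gives you a fresh file on every run.

```python
import os
import tempfile


def write_all_pages(filename, pages):
    """Write the header once, then one line per number from every page."""
    # Opened a single time, before any looping over pages.
    with open(filename, "w") as f:
        f.write("numbers\n")
        for rows in pages:        # one iteration per scraped page
            for number in rows:   # one iteration per result on that page
                f.write(number + ",\n")
    # The with-block closes the file here, after all pages are written.


# Stand-in data (an assumption for this sketch, not real IMDb output).
fake_pages = [["1.", "2."], ["51.", "52."]]

out_path = os.path.join(tempfile.mkdtemp(), "test.csv")
write_all_pages(out_path, fake_pages)

with open(out_path) as f:
    lines = f.read().splitlines()
print(lines)
```

Using a with-block instead of a manual f.close() is a design choice of this sketch: the file is closed even if an exception is raised partway through a page.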

Upvotes: 1
