Adam_G

Reputation: 7879

File Create/Write Issue In Python

I'm trying to create and write to a file. I have the following code:

from urllib2 import urlopen

def crawler(seed_url):
    to_crawl = [seed_url]
    crawled=[]
    while to_crawl:
        page = to_crawl.pop()
        page_source = urlopen(page)
        s = page_source.read()
        with open(str(page)+".txt","a+") as f:
            f.write(s)
            f.close()
    return crawled

if __name__ == "__main__":
    crawler('http://www.yelp.com/')

However, it returns the error:

Traceback (most recent call last):
  File "/Users/adamg/PycharmProjects/NLP-HW1/scrape-test.py", line 29, in <module>
    crawler('http://www.yelp.com/')
  File "/Users/adamg/PycharmProjects/NLP-HW1/scrape-test.py", line 14, in crawler
    with open("./"+str(page)+".txt","a+") as f:
IOError: [Errno 2] No such file or directory: 'http://www.yelp.com/.txt'

I thought that open(file,"a+") is supposed to create and write. What am I doing wrong?

Upvotes: 0

Views: 43

Answers (1)

khampson

Reputation: 15336

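open(file, "a+") does create the file if it doesn't exist, but it won't create missing directories. The slashes in the URL are treated as path separators, so Python looks for a directory named 'http:' (and so on) that doesn't exist, hence the "No such file or directory" error. If you want to use the URL as the basis for the file name, you should encode the URL. That way, slashes (among other characters) will be converted to character sequences which won't interfere with the file system/shell.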

The urllib library can help with this: urllib.quote_plus in Python 2 (which the question uses), or urllib.parse.quote_plus in Python 3.

So, for example:

>>> import urllib
>>> urllib.quote_plus('http://www.yelp.com/')
'http%3A%2F%2Fwww.yelp.com%2F'
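
To tie that back to the crawler, here is a minimal sketch of how the encoded name might be used when saving a page. The helper names url_to_filename and save_page are my own (they're not in the question), the ".txt" suffix is carried over from the original code, and the import fallback is only there so the snippet runs on both Python 2 and 3. It assumes the page body has already been read, e.g. with urlopen(page).read().

```python
import os

try:
    from urllib.parse import quote_plus  # Python 3
except ImportError:
    from urllib import quote_plus  # Python 2, as in the question


def url_to_filename(url, suffix=".txt"):
    # Hypothetical helper: percent-encode the URL so it contains no
    # slashes or colons, then add the suffix used in the question.
    return quote_plus(url) + suffix


def save_page(url, body, directory="."):
    # Hypothetical helper: append the page body (bytes) to a file named
    # after the encoded URL. The "with" block closes the file on exit,
    # so no explicit f.close() is needed.
    path = os.path.join(directory, url_to_filename(url))
    with open(path, "ab") as f:
        f.write(body)
    return path
```

Inside the loop you'd then call something like save_page(page, s) instead of opening str(page)+".txt" directly. Note that the target directory itself still has to exist; only the file is created for you.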

Upvotes: 5
