clubby789

Reputation: 2733

Saving files to a subdirectory

I've been working on a scraper to get large amounts of HTML and images from a website. I've got the scraper working, but the directory fills massively, making it hard to navigate. How would I go about saving it to a subdirectory? The part that saves the HTML:

t = open(str(current)+".html", 'w+')
t.write(b)
t.close()

And the part that saves the image:

urllib.request.urlretrieve(img, page+".gif")

Upvotes: 2

Views: 16141

Answers (1)

suroh

Reputation: 917

You're only showing us part of your code, which makes it hard to be specific. That said, writing to a subdirectory is simple, but you first have to create one. For now I can only give you a few basic examples because I don't know what the rest of your code looks like. Hope something here helps!

import os

def create_folder(self, path):
    try:
        if os.path.isdir(path):
            print("Error: The directory you're attempting to create already exists")  # or just pass
        else:
            os.makedirs(path)
    except OSError as exception:  # os.makedirs raises OSError on failure
        raise OSError('%s: %s' % (path, exception.strerror))
    return None

Or, even easier:

os.makedirs("C:\\Example Folder\\")

Or, in the case of Linux:

os.makedirs('/home/' + os.getlogin() + '/Example Folder/')

Then just write to it as you normally would: supply the path to the subdirectory as part of the file name.

def write(self, path, text):
    try:
        if os.path.isfile(path):
            return None  # the file already exists: or print an error, or pass, etc.
        else:
            with open(path, 'w') as outFile:
                outFile.write(text)
    except IOError as exception:
        raise IOError("%s: %s" % (path, exception.strerror))

    return None

In this case, you'd put the path to the file inside your subdirectory in the "path" parameter and the variable containing the text in the "text" parameter. You can modify this function to append, write bytes, etc.
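
Applied to the two snippets in the question, that could look roughly like the sketch below. The helper names save_html and image_path and the folder names "pages" and "images" are made up for illustration; current, b, img and page are the variables from the question. It assumes Python 3.2 or newer, because of exist_ok=True.

```python
import os


def save_html(subdir, name, html):
    """Write html to <subdir>/<name>.html, creating subdir if needed."""
    os.makedirs(subdir, exist_ok=True)  # no error if the folder already exists
    path = os.path.join(subdir, str(name) + ".html")
    with open(path, "w") as out_file:
        out_file.write(html)
    return path


def image_path(subdir, name):
    """Return <subdir>/<name>.gif, creating subdir if needed."""
    os.makedirs(subdir, exist_ok=True)
    return os.path.join(subdir, str(name) + ".gif")


# In the scraper this would replace the original two snippets, e.g.:
#   save_html("pages", current, b)
#   urllib.request.urlretrieve(img, image_path("images", page))
```

os.path.join picks the right path separator for the OS, so the same code should work on Windows and Linux.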

Updated information addressing your comments

A really simple way to make small-scale Python programs more cross-platform is to do something like:

import sys

if sys.platform == 'win32':
    print('This is windows')
elif sys.platform.startswith('linux'):  # 'linux2' on Python 2, 'linux' on Python 3.3+
    print('This is some form of linux')

You can add that to check the OS and then run the appropriate block for each one :)
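
If the only thing that differs per OS is where the folder lives, you may not need the platform check at all. A minimal sketch, assuming you want a folder under the current user's home directory (user_subfolder is a hypothetical helper name, not something from your code):

```python
import os


def user_subfolder(name):
    """Build a path to a folder inside the current user's home directory."""
    # expanduser("~") resolves the home directory on both Windows and Linux.
    return os.path.join(os.path.expanduser("~"), name)


# Create it once, then reuse the path when saving files.
# os.makedirs(user_subfolder("Example Folder"), exist_ok=True)
```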

Yes, you're correct that opening a file with the 'w' flag overwrites it. You can append to a file instead (add new text without overwriting the existing text) by changing the 'w' flag to 'a', like so:

def append(self, path, text):
    try:
        if os.path.isfile(path):
            with open(path, 'a') as outFile:
                outFile.write(text)
    except IOError as exception:
        raise IOError('%s: %s' % (path, exception.strerror))
    return None
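
To see the difference between the two flags side by side, here is a small self-contained sketch. The names write_text, append_text and demo are made up for this example, and it writes to a temporary folder so nothing of yours gets touched. Note that, unlike the append function above, opening with 'a' on its own will also create the file if it doesn't exist yet.

```python
import os
import tempfile


def write_text(path, text):
    """Open with 'w': any existing content is replaced."""
    with open(path, "w") as out_file:
        out_file.write(text)


def append_text(path, text):
    """Open with 'a': text is added after the existing content."""
    with open(path, "a") as out_file:
        out_file.write(text)


def demo():
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "log.txt")
        write_text(path, "first\n")
        write_text(path, "second\n")   # replaces "first"
        append_text(path, "third\n")   # keeps "second", adds "third"
        with open(path) as in_file:
            return in_file.read()


print(repr(demo()))  # 'second\nthird\n'
```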

Further updates:

You can remove "self" if you're not working with classes.

Based on your last comment, which was "What do I put in self", I really highly suggest you put your project aside temporarily and first learn the basics of Python. You can find tutorials all over, including in the following places.

https://www.tutorialspoint.com/python/

https://docs.python.org/3/tutorial/

If you're using an older version, you can simply switch the official site to whichever version you're using. I wish you the best of luck, but unfortunately I can't help you further until you know at least the basics. I'm sorry!

Update: This is coming after a long time, but I felt obligated to add these to the answer because this post was recently viewed again.

os.mkdir(r'\path\to\dir')  # raw string, so '\t' is not read as a tab
# is also valid
# On Python 3, use the following
if sys.platform.startswith('linux'):
    print('This is linux')
    # insert code here. We use .startswith('linux')
    # because sys.platform is 'linux2' on Python 2 but plain 'linux' on Python 3.3+
elif sys.platform.startswith('win'):
    print('This is windows')
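
One thing to keep in mind when choosing between os.mkdir and os.makedirs: os.mkdir only creates the last folder in the path, while os.makedirs also creates any missing parent folders. A rough sketch of the difference (demo_mkdir_vs_makedirs and the "a"/"b" folder names are just for illustration; exist_ok=True needs Python 3.2+):

```python
import os
import tempfile


def demo_mkdir_vs_makedirs(base):
    """Try to create base/a/b with os.mkdir, then with os.makedirs."""
    nested = os.path.join(base, "a", "b")
    try:
        os.mkdir(nested)  # fails here: the parent folder "a" does not exist yet
        mkdir_worked = True
    except FileNotFoundError:
        mkdir_worked = False
    os.makedirs(nested, exist_ok=True)  # creates "a" and then "b"
    os.makedirs(nested, exist_ok=True)  # calling it again is not an error
    return mkdir_worked, os.path.isdir(nested)


with tempfile.TemporaryDirectory() as tmp:
    print(demo_mkdir_vs_makedirs(tmp))  # (False, True)
```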

Upvotes: 6
