Austin Capobianco

Reputation: 212

Make Python save to a folder created in the directory of the .py file being run

I'm trying to save a bunch of pages in a folder next to the .py file that creates them. I'm on Windows, so when I try to write the trailing backslash for the file path, Python treats it as a special character instead.

Here's what I'm talking about:

from bs4 import BeautifulSoup
import urllib2, urllib
import csv
import requests
from os.path import expanduser

print "yes"
with open('intjpages.csv', 'rb') as csvfile:
    pagereader = csv.reader(csvfile)
    i=0
    for row in pagereader:
        print row
        agentheader = {'User-Agent': 'Nerd'}
        request = urllib2.Request(row[0],headers=agentheader)
        url = urllib2.urlopen(request)       
        soup = BeautifulSoup(url)        
        for div in soup.findAll('div', {"class" : "side"}):
            div.extract()
        body = soup.find_all("div", { "class" : "md" })
        name = "page" + str(i) + ".html"
        path_to_file = "\cleanishdata\" 
        outfile = open(path_to_file + name, 'w')
        #outfile = open(name,'w')  #this works fine
        body=str(body)
        outfile.write(body)
        outfile.close()
        i+=1

I can save the files to the same folder that the .py file is in, but when I process the files using RapidMiner it includes the program too. Also, it would just be neater if I could save them in a separate directory.

I am surprised this hasn't already been answered somewhere on the internet.

EDIT: Thanks so much! I ended up using information from both of your answers. IDLE was making me use r'\string\' to concatenate the strings with the backslashes. I needed to use abarnert's path_to_script technique to solve the problem of creating a new folder wherever the .py file is. Thanks again! Here are the relevant code changes:

        name = "page" + str(i) + ".txt"
        path_to_script_dir = os.path.dirname(os.path.abspath("links.py"))
        newpath = path_to_script_dir + r'\\' + 'cleanishdata'
        if not os.path.exists(newpath): os.makedirs(newpath)
        outfile = open(path_to_script_dir + r'\\cleanishdata\\' + name, 'w')
        body=str(body)
        outfile.write(body)
        outfile.close()
        i+=1

Upvotes: 0

Views: 1190

Answers (2)

JJ Geewax

Reputation: 10579

Are you sure you're escaping your backslashes properly?

The \" in your string "\cleanishdata\" is actually an escaped quote character (").

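You probably want

"\\cleanishdata\\"

Note that a raw string such as r"\cleanishdata\" will not work here, because a raw string literal cannot end in a single backslash.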
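
To see what these spellings actually produce, here is a small sketch. The variable names are only illustrative, and it is written to run on Python 3 as well as Python 2:

```python
# Each "\\" in the source is a single backslash in the resulting string.
escaped = "\\cleanishdata\\"

# A raw string keeps backslashes as written. A raw literal cannot end in a
# single backslash, so the trailing one is appended separately here.
raw_based = r"\cleanishdata" + "\\"

# In an ordinary string, a sequence like \t is translated, which is why
# Windows paths written without escaping can silently change.
tabbed = "C:\temp"

print(escaped)
print(escaped == raw_based)
print(len(tabbed))
```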

You probably also want to check out the os.path module, in particular os.path.join and os.path.dirname.

For example, if your file is in C:\Base\myfile.py and you want to save files to C:\Base\cleanishdata\output.txt, you'd want:

import os
import sys

os.path.join(
    os.path.dirname(os.path.abspath(sys.argv[0])),  # C:\Base
    'cleanishdata',
    'output.txt')
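
As a rough illustration of what those two calls return for the example paths, the sketch below uses ntpath, which is the Windows implementation behind os.path and can be imported on any platform. The script path is the hypothetical one from the example, not a real file:

```python
import ntpath  # Windows flavor of os.path; importable on any platform

# Hypothetical script location taken from the example above.
script_path = "C:\\Base\\myfile.py"

# dirname drops the file name and leaves no trailing backslash.
script_dir = ntpath.dirname(script_path)

# join inserts the separators, so no backslashes are typed by hand.
output_path = ntpath.join(script_dir, "cleanishdata", "output.txt")

print(script_dir)
print(output_path)
```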

Upvotes: 3

abarnert

Reputation: 365747

A better solution than hardcoding the path to the .py file is to just ask Python for it:

import sys
import os

path_to_script = sys.argv[0]
path_to_script_dir = os.path.dirname(os.path.abspath(path_to_script))

Also, it's generally better to use os.path functions instead of string manipulation:

outfile = open(os.path.join(path_to_script_dir, name), 'w')

Besides making your program continue to work as expected even if you move it to a different location or install it on another machine or give it to a friend, getting rid of the hardcoded paths and the string-based path concatenation means you don't have to worry about backslashes anywhere, and this problem never arises in the first place.
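
Putting the pieces together, one possible sketch looks like the following. The helper name output_path_for and its parameters are made up for illustration, and the folder name cleanishdata is taken from the question. It assumes the script is started directly, so that sys.argv[0] points at it:

```python
import os
import sys


def output_path_for(name, script_path=None, folder="cleanishdata"):
    """Return a path for `name` inside `folder` next to the script.

    The folder is created if it does not exist yet.
    """
    if script_path is None:
        # Assumption: the script was launched directly, so argv[0] is its path.
        script_path = sys.argv[0]
    script_dir = os.path.dirname(os.path.abspath(script_path))
    target_dir = os.path.join(script_dir, folder)
    if not os.path.isdir(target_dir):
        os.makedirs(target_dir)
    return os.path.join(target_dir, name)
```

In the question's loop, this would replace the hardcoded path with something like outfile = open(output_path_for(name), 'w').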

Upvotes: 2
