smatthewenglish

Reputation: 2889

Generalize a Python script to run on all files in a directory

I have the following Python script:

with open('ein.csv', 'r') as istr:
    with open('aus.csv', 'w') as ostr:
        for line in istr:
            line = line.rstrip('\n') + ',1'
            print(line, file=ostr)

How could this be generalized to run on all the files in a directory and output a separate file for each one?

Maybe with a function like this:

for phyle in list_files('.'):
    with open(phyle, 'r') as istr:
        with open('ausput_xyz.csv', 'w') as ostr:
            for line in istr:
                line = line.rstrip('\n') + ',1'
                print(line, file=ostr)

import os

def list_files(path):
    # Return the names (with extension, without the full path) of all files
    # in the folder `path`.
    files = []
    for name in os.listdir(path):
        if os.path.isfile(os.path.join(path, name)):
            files.append(name)
    return files

Upvotes: 0

Views: 708

Answers (2)

Kasravnd

Reputation: 107287

First off, a more Pythonic way to deal with CSV files is to use the csv module, and to open the files in a with statement, which closes the file objects automatically at the end of the block. You can use os.walk() to iterate over the files and directories under a given path (it also descends into subdirectories):

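import csv
import os

for path_name, dirs, files in os.walk('relative_path'):
    for file_name in files:
        in_path = os.path.join(path_name, file_name)
        out_path = '{}.csv'.format(in_path)
        # Text mode with newline='' is what the csv module expects in Python 3.
        with open(in_path, newline='') as inp, open(out_path, 'w', newline='') as out:
            spamwriter = csv.writer(out, delimiter=',')
            for line in inp:
                # writerow() expects a sequence of fields, so split the line first
                spamwriter.writerow(line.rstrip('\n').split(','))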

Note that the names in files are relative to path_name, so unless the files are in the current working directory, the directory has to be prepended to the file name when opening them:

with open(os.path.join(path_name, file_name)) as inp, ...
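To tie this back to the question, here is a minimal sketch that uses the csv module to append a ",1" column and writes one output file per input. The function names append_column and append_column_in_tree, and the "-out.csv" suffix, are illustrative assumptions, not something from the question.

```python
import csv
import os


def append_column(in_path, out_path, value='1'):
    """Copy in_path to out_path, adding `value` as the last field of each row."""
    # newline='' is what the csv docs recommend for files given to reader/writer
    with open(in_path, newline='') as inp, open(out_path, 'w', newline='') as out:
        writer = csv.writer(out)
        for row in csv.reader(inp):
            writer.writerow(row + [value])


def append_column_in_tree(root, suffix='-out.csv'):
    """Walk root and write '<name>-out.csv' next to each .csv file found."""
    written = []
    for path_name, dirs, files in os.walk(root):
        for file_name in files:
            # skip non-CSV files and outputs left over from an earlier run
            if not file_name.endswith('.csv') or file_name.endswith(suffix):
                continue
            in_path = os.path.join(path_name, file_name)
            out_path = os.path.splitext(in_path)[0] + suffix
            append_column(in_path, out_path)
            written.append(out_path)
    return written
```

Calling append_column_in_tree('relative_path') would then return the list of output paths it wrote. Going through csv.reader rather than splitting on commas by hand means quoted fields that contain commas should survive the round trip.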

Upvotes: 2

kfx

Reputation: 8537

Just put the code in a function and call it:

import os

def process(infilename):
    # "abc.csv" -> "abc-out.csv", written next to the input file
    outfilename = os.path.splitext(infilename)[0] + "-out.csv"
    with open(infilename, 'r') as istr:
        with open(outfilename, 'w') as ostr:
            for line in istr:
                line = line.rstrip('\n') + ',1'
                print(line, file=ostr)

def process_files(path):
    for name in os.listdir(path):
        full_name = os.path.join(path, name)
        if os.path.isfile(full_name):
            # pass the joined path so this also works outside the current directory
            process(full_name)

In a directory with input files "abc.csv", "xyz.csv" this code will create output files named "abc-out.csv" and "xyz-out.csv".

Note that os.listdir(path) is called just once during the execution, so the list of files to process will not include the newly created output files.
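The single listing only protects one run, though: a second run would see "abc-out.csv" as an input too. A minimal sketch of one way around that, using pathlib and skipping anything that already carries the output suffix (the name add_flag_column and the suffix parameter are assumptions made for illustration):

```python
from pathlib import Path


def add_flag_column(directory, suffix='-out'):
    """Write '<stem>-out.csv' for each input .csv in directory; return the new paths."""
    directory = Path(directory)
    # sorted() materialises the listing up front, like the single os.listdir() call
    inputs = sorted(p for p in directory.glob('*.csv')
                    if p.is_file() and not p.stem.endswith(suffix))
    outputs = []
    for src in inputs:
        dst = src.with_name(src.stem + suffix + '.csv')
        with src.open('r') as istr, dst.open('w') as ostr:
            for line in istr:
                print(line.rstrip('\n') + ',1', file=ostr)
        outputs.append(dst)
    return outputs
```

Running add_flag_column('.') twice in a directory with "abc.csv" and "xyz.csv" should leave just "abc-out.csv" and "xyz-out.csv" as outputs, overwritten on the second run, instead of also producing "abc-out-out.csv".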

Upvotes: 3
