AI52487963

Reputation: 179

Looping over filenames in Python

I have a zillion files in a directory that I want to run a script on. Their filenames all look like prefix_foo_123456_asdf_asdfasdf.csv. I know how to loop over files in a shell script using a variable in the filename, but not in Python. Is there a corresponding way to do something like this:

for i in $(seq 0 99); do
    ./process.py prefix_foo_${i}_*
done

Upvotes: 2

Views: 5014

Answers (3)

Dmitry Zagorulkin

Reputation: 8548

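Another way is os.walk, which yields a (dirpath, dirnames, filenames) tuple for every directory below the starting point:

>>> from os import walk
>>> for dirpath, dirnames, filenames in walk('/home'):
...     print(dirpath, dirnames, filenames)

Output: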

/home/di/workspace/local2stream/mediaelement/.git/info [] ['exclude']
/home/di/workspace/local2stream/mediaelement/.git/logs ['refs'] ['HEAD']
/home/di/workspace/local2stream/mediaelement/.git/logs/refs ['remotes', 'heads'] []
/home/di/workspace/local2stream/mediaelement/.git/logs/refs/remotes ['origin'] []
/home/di/workspace/local2stream/mediaelement/.git/logs/refs/remotes/origin [] ['HEAD']
/home/di/workspace/local2stream/mediaelement/.git/logs/refs/heads [] ['master']
/home/di/workspace/local2stream/mediaelement/.git/objects ['info', 'pack'] []
/home/di/workspace/local2stream/mediaelement/.git/objects/info [] []
/home/di/workspace/local2stream/mediaelement/.git/objects/pack [] ['pack-a378eaa927a4825f049faf10bab35cf5d94545f1.idx', 'pack-a378eaa927a4825f049faf10bab35cf5d94545f1.pack']
/home/di/workspace/local2stream/mediaelement/.git/refs ['tags', 'remotes', 'heads'] []
/home/di/workspace/local2stream/mediaelement/.git/refs/tags [] []
/home/di/workspace/local2stream/mediaelement/.git/refs/remotes ['origin'] []
/home/di/workspace/local2stream/mediaelement/.git/refs/remotes/origin [] ['HEAD']
/home/di/workspace/local2stream/mediaelement/.git/refs/heads [] ['master']
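To tie this back to the question, the names that walk yields can be filtered with a shell-style pattern. This is only a minimal sketch: it assumes the files may sit in subdirectories, and find_matching is a hypothetical helper name, not something from the standard library.

```python
import fnmatch
import os


def find_matching(root, pattern):
    """Yield full paths of files under root whose name matches pattern."""
    for dirpath, dirnames, filenames in os.walk(root):
        # fnmatch.filter keeps only the names that match the shell-style pattern
        for name in fnmatch.filter(filenames, pattern):
            yield os.path.join(dirpath, name)


if __name__ == "__main__":
    # The pattern is an assumption based on the filenames in the question
    for path in find_matching(".", "prefix_foo_*.csv"):
        print(path)
```

If all the files live in a single directory, walking the whole tree is more than is needed and a plain glob is probably simpler.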

Upvotes: 2

mgilson

Reputation: 310257

You can use glob.glob or glob.iglob to get a list or an iterator of filenames.

For example, if your directory contains "file1.txt", "file2.txt" and "file3.txt":

import glob
print (glob.glob('*.txt'))  #['file1.txt','file2.txt','file3.txt']

Note that the list won't necessarily be sorted.
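If the order matters, the result can be sorted explicitly. A plain sorted() call orders names as strings, so file10.txt would come before file2.txt. The sketch below sorts on the first number found in each file name instead; numeric_key and sorted_matches are hypothetical helper names chosen for illustration.

```python
import glob
import os
import re


def numeric_key(path):
    """Sort key: the first run of digits in the file name, then the name itself."""
    name = os.path.basename(path)
    match = re.search(r"\d+", name)
    number = int(match.group()) if match else -1  # names without digits sort first
    return (number, name)


def sorted_matches(pattern):
    """Return the paths matching pattern, ordered by the number in their names."""
    return sorted(glob.glob(pattern), key=numeric_key)


if __name__ == "__main__":
    print(sorted_matches("*.txt"))
```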

Your loop can be written as:

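import glob
import subprocess

for i in range(100):
    # Expand the pattern in Python; subprocess.call does not go through a shell here
    files = glob.glob('prefix_foo_%d_*' % i)
    if files:  # skip numbers that have no matching files
        subprocess.call(['./process.py'] + files)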

Of course, using subprocess in Python to run another Python program is probably not the best design. You could probably import what you need from the other module and run it without spawning another process.

Upvotes: 4

Anthon

Reputation: 76912

Use the standard library's glob module. Assuming the functionality of process.py is in the function process_one_file:

from glob import glob
from process import process_one_file

for i in range(100):
    # glob returns a list of matches, so handle each file in turn
    for filename in glob('prefix_foo_{}_*'.format(i)):
        process_one_file(filename)
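If calling glob once per number turns out to be slow with very many files, one alternative is to list the files once and pull the number out of each name with a regular expression. This is a rough sketch under the assumption that the number always follows prefix_foo_ directly; group_by_number and NAME_RE are hypothetical names, not part of process.py.

```python
import glob
import os
import re
from collections import defaultdict

# Assumed filename layout: prefix_foo_<number>_<anything>
NAME_RE = re.compile(r"^prefix_foo_(\d+)_")


def group_by_number(directory="."):
    """Map each number found in the file names to the list of matching paths."""
    groups = defaultdict(list)
    for path in glob.glob(os.path.join(directory, "prefix_foo_*")):
        match = NAME_RE.match(os.path.basename(path))
        if match:  # ignore names where the middle part is not a number
            groups[int(match.group(1))].append(path)
    return groups


if __name__ == "__main__":
    groups = group_by_number(".")
    for number in sorted(groups):
        if number < 100:  # same range as the loop in the question
            print(number, groups[number])
```

Each path in a group could then be passed to whatever function does the processing.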

Upvotes: 2
