Reputation: 179
I have a zillion files in a directory that I want to run a script on. They all have filenames like prefix_foo_123456_asdf_asdfasdf.csv. I know how to loop over files in a directory using a variable in the filename in shell, but not in Python. Is there a corresponding way to do something like:
for i in $(seq 0 99); do
    ./process.py prefix_foo_${i}_*
done
Upvotes: 2
Views: 5014
Reputation: 8548
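Another way is os.walk, which visits a directory tree recursively:
from os import walk

# walk yields (dirpath, dirnames, filenames) for every directory under the root
for dirpath, dirnames, filenames in walk('/home'):
    print(dirpath, dirnames, filenames)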
Output:
/home/di/workspace/local2stream/mediaelement/.git/info [] ['exclude']
/home/di/workspace/local2stream/mediaelement/.git/logs ['refs'] ['HEAD']
/home/di/workspace/local2stream/mediaelement/.git/logs/refs ['remotes', 'heads'] []
/home/di/workspace/local2stream/mediaelement/.git/logs/refs/remotes ['origin'] []
/home/di/workspace/local2stream/mediaelement/.git/logs/refs/remotes/origin [] ['HEAD']
/home/di/workspace/local2stream/mediaelement/.git/logs/refs/heads [] ['master']
/home/di/workspace/local2stream/mediaelement/.git/objects ['info', 'pack'] []
/home/di/workspace/local2stream/mediaelement/.git/objects/info [] []
/home/di/workspace/local2stream/mediaelement/.git/objects/pack [] ['pack-a378eaa927a4825f049faf10bab35cf5d94545f1.idx', 'pack-a378eaa927a4825f049faf10bab35cf5d94545f1.pack']
/home/di/workspace/local2stream/mediaelement/.git/refs ['tags', 'remotes', 'heads'] []
/home/di/workspace/local2stream/mediaelement/.git/refs/tags [] []
/home/di/workspace/local2stream/mediaelement/.git/refs/remotes ['origin'] []
/home/di/workspace/local2stream/mediaelement/.git/refs/remotes/origin [] ['HEAD']
/home/di/workspace/local2stream/mediaelement/.git/refs/heads [] ['master']
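To narrow a walk like this down to the files from the question, the names that os.walk yields can be filtered with fnmatch. A minimal sketch, assuming the prefix_foo_<number>_... naming from the question; find_matching is a hypothetical helper name, not part of any library:

```python
import fnmatch
import os


def find_matching(root, pattern):
    """Yield paths under root whose file name matches a shell-style pattern."""
    # os.walk yields (dirpath, dirnames, filenames) for each directory it visits
    for dirpath, dirnames, filenames in os.walk(root):
        # fnmatch.filter keeps only the names that match the pattern
        for name in fnmatch.filter(filenames, pattern):
            yield os.path.join(dirpath, name)
```

Something like find_matching('.', 'prefix_foo_*.csv') would then yield every matching file under the current directory, including those in subdirectories. If the files all live in one directory, a recursive walk is more than is needed.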
Upvotes: 2
Reputation: 310257
You can use glob.glob or glob.iglob to get a list or an iterator of filenames, respectively.
For example, if your directory contains file1.txt, file2.txt and file3.txt:
import glob
print(glob.glob('*.txt'))  # e.g. ['file1.txt', 'file2.txt', 'file3.txt']
Note that the list won't necessarily be sorted; wrap the call in sorted() if the order matters.
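When the filenames carry a number, as in the question, a plain string sort puts 10 before 9, so it can help to sort on the numeric field instead. A minimal sketch, assuming the prefix_foo_<number>_ naming from the question; numbered_files is a hypothetical helper name:

```python
import glob
import os
import re


def numbered_files(directory, prefix="prefix_foo_"):
    """Return (number, path) pairs for files named <prefix><number>_..., sorted by number."""
    pattern = re.compile(re.escape(prefix) + r"(\d+)_")
    pairs = []
    # glob.iglob returns an iterator, so no full list of names is built up front
    for path in glob.iglob(os.path.join(glob.escape(directory), prefix + "*")):
        match = pattern.match(os.path.basename(path))
        if match:  # skip names whose field after the prefix is not a number
            pairs.append((int(match.group(1)), path))
    return sorted(pairs)
```

This reads the directory once rather than once per value of i, which may matter with a very large number of files.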
Your loop can be written as:
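import glob
import subprocess

for i in range(100):
    # Collect every file whose name starts with prefix_foo_<i>_
    files = glob.glob('prefix_foo_%d_*' % i)
    if files:  # skip values of i that match no files
        subprocess.call(['./process.py'] + files)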
Of course, using subprocess in Python to run another Python program is probably not the best design. You could probably import what you need from the other module and call it directly, without spawning another process.
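One common way to make a script importable is to put the work in a function and guard the command-line entry point. A sketch of how process.py might be laid out; process_one_file and its line-counting body are hypothetical placeholders, since the real contents of process.py aren't shown:

```python
import sys


def process_one_file(path):
    """Hypothetical placeholder for the real work: count the lines in one file."""
    with open(path) as handle:
        return sum(1 for _ in handle)


def main(paths):
    # Process each path given on the command line and report the result
    for path in paths:
        print(path, process_one_file(path))


if __name__ == "__main__":
    # Runs only when the file is executed as a script, not when it is imported
    main(sys.argv[1:])
```

With that layout, ./process.py still works from the shell, and other Python code can import the function without triggering the command-line behaviour.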
Upvotes: 4
Reputation: 76912
Use the standard library module glob. Assuming the functionality of process.py is available in a function process_one_file:
from glob import glob
from process import process_one_file

for i in range(100):
    # glob returns a list, so hand the matches to the function one at a time
    for path in glob('prefix_foo_{}_*'.format(i)):
        process_one_file(path)
Upvotes: 2