Z R

Reputation: 121

Making os.walk work in a non-standard way

I'm trying to do the following, in this order:

Use os.walk() to go down each directory.
Each directory has subfolders, but I'm only interested in the first subfolder. For example, the directory looks like:

/home/RawData/SubFolder1/SubFolder2

In RawData2, I want to have folders that stop at the SubFolder1 level.

The thing is, it seems like os.walk() goes down through ALL of the RawData folder, and I'm not certain how to make it stop.

The below is what I have so far - I've tried a number of other combinations of substituting variable dirs for root, or files, but that doesn't seem to get me what I want.

import os 

for root, dirs, files in os.walk("/home/RawData"): 

    os.chdir("/home/RawData2/")
    make_path("/home/RawData2/"+str(dirs))

Upvotes: 7

Views: 96

Answers (2)

Serge Ballesta

Reputation: 149185

Beware: Documentation for os.walk says:

don’t change the current working directory between resumptions of walk(). walk() never changes the current directory, and assumes that its caller doesn’t either

so you should avoid os.chdir("/home/RawData2/") in the walk loop.

You can easily tell walk not to recurse by using topdown=True (the default) and clearing dirs in place:

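import os

for root, dirs, files in os.walk("/home/RawData", topdown=True):
    for rep in dirs:
        # make_path is the asker's own helper from the question
        make_path(os.path.join("/home/RawData2/", rep))
        # add processing here
    del dirs[:]  # empty the list in place so walk does not recurse into any subdirectory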
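
As a self-contained sketch of the same pruning idea: this assumes make_path does nothing more than create the directory, so os.makedirs stands in for it here. The function name mirror_first_level and its arguments are hypothetical, chosen only for illustration.

```python
import os


def mirror_first_level(src, dst):
    """Create one empty folder in dst per immediate subdirectory of src."""
    created = []
    for root, dirs, files in os.walk(src, topdown=True):
        for name in dirs:
            target = os.path.join(dst, name)
            # Stand-in for the asker's make_path (assumption: it only creates the folder)
            os.makedirs(target, exist_ok=True)
            created.append(target)
        # Clear the list in place; walk() reads this same list to decide
        # where to descend, so nothing below the first level is visited.
        del dirs[:]
    return sorted(created)
```

Calling mirror_first_level("/home/RawData", "/home/RawData2") should then recreate SubFolder1 under RawData2 without SubFolder2. Because the paths are built with os.path.join, no os.chdir is needed.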

Upvotes: 1

idjaw

Reputation: 26600

I suggest you use glob instead.

As the help on glob describes:

glob(pathname)
    Return a list of paths matching a pathname pattern.

    The pattern may contain simple shell-style wildcards a la
    fnmatch. However, unlike fnmatch, filenames starting with a
    dot are special cases that are not matched by '*' and '?'
    patterns.

So, your pattern is every first level directory, which I think would be something like this:

/root_path/*/sub_folder1/sub_folder2

So, you start at your root, get everything in that first level, and then look for sub_folder1/sub_folder2. I think that works.

To put it all together:

from glob import glob

dirs = glob('/root_path/*/sub_folder1/sub_folder2')

# Then iterate for each path
for i in dirs:
    print(i)
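
If the goal is only the names of the first-level folders, as in the question, a glob pattern ending in a path separator might be enough, since such a pattern matches directories only. This is a minimal sketch under that reading of the question; first_level_dirs is a hypothetical helper name, and, as the help text above notes, folders whose names start with a dot are not matched by '*'.

```python
import os
from glob import glob


def first_level_dirs(src):
    """Return the names of the immediate subdirectories of src."""
    # The trailing empty component makes the pattern end with a separator,
    # e.g. "/home/RawData/*/", which restricts matches to directories.
    pattern = os.path.join(src, "*", "")
    # normpath drops the trailing separator so basename yields the folder name
    return sorted(os.path.basename(os.path.normpath(p)) for p in glob(pattern))
```

Each returned name could then be joined onto the destination root with os.path.join to create the matching folder.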

Upvotes: 1
