A Newbie

Reputation: 113

Finding a csv file by traversing sub directories

I have the folder structure shown below. From the Python file 'MyFile.py', I am trying to access the csv file "sample4.csv", which sits in the subdirectory "2021" (a folder within 'yearlydata_subfolder2').

I've tried using os.path with the code below, but it doesn't work and raises 'FileNotFoundError'. Could someone please help?

C:.
├───1. Parent Folder : 
│       'Myfile.py'

   ├───2. 'data'  (subfolder1)

       ├───3. 'AWSyearly'                      (subfolder2)
           ├───3.1. '2020'                     (folder within 'yearlydata_subfolder2')
               sample1.csv                     (file in folder '2020')
               ├───3.1.1 monthlydata_2020      (subfolder within 2020)
                   sample2_2020_Jan.csv
                   sample3_2020_Feb.csv

           ├───3.2 '2021'                      (folder within 'yearlydata_subfolder2')
               sample4.csv                     (file in folder '2021') 
               ├───3.2.1 monthlydata_2021      (subfolder within 2021)
                   sample5_2021_Jan.csv
                   sample6_2021_Feb.csv

Code:

def access_csv():
    path = os.path.realpath(os.path.join(os.getcwd(),os.path.dirname(__file__)))
    print("Path of csv is:", path)
    for root, dirs, files in os.walk(path):
        for file in files:
            if file.startswith('sample4') and file.endswith(".csv"):
                df = pd.read_csv(file, decimal=",", delimiter=";", index_col=0)
    return df

Upvotes: 0

Views: 62

Answers (1)

ayy_lmao

Reputation: 25

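Try using glob with a recursive pattern. With recursive=True, "**" matches any number of nested directories, so you don't need to know how deep the 2021 folder sits: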

import glob

parent_dir_fp = r"/parent_folder" # update this

matching_filepaths = glob.glob(parent_dir_fp + "/**/2021/sample4.csv", recursive=True)
# do whatever with filepaths from here
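
If you would rather keep os.walk, the FileNotFoundError in the question most likely comes from passing the bare file name to pd.read_csv: os.walk yields names relative to root, so they have to be joined with root before opening. Below is a minimal sketch of that fix, not a definitive implementation. The helper name find_csv_path and the start_dir parameter are my own (hypothetical); the read_csv arguments are copied from the question.

```python
import os


def find_csv_path(start_dir, prefix="sample4", suffix=".csv"):
    """Return the full path of the first matching file under start_dir, or None."""
    for root, dirs, files in os.walk(start_dir):
        for name in files:
            if name.startswith(prefix) and name.endswith(suffix):
                # os.walk yields bare file names, so join them with root
                return os.path.join(root, name)
    return None


def access_csv(start_dir):
    # Imported here so the path search above works without pandas installed
    import pandas as pd

    csv_path = find_csv_path(start_dir)
    if csv_path is None:
        # Fail explicitly instead of returning an unassigned variable
        raise FileNotFoundError(f"no sample4*.csv found under {start_dir}")
    return pd.read_csv(csv_path, decimal=",", delimiter=";", index_col=0)
```

You could call it as access_csv(os.path.dirname(os.path.abspath(__file__))) from MyFile.py. Returning the path first also avoids the case in the original function where df is never assigned because no file matched.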

Upvotes: 1
