Reputation: 2580
I have a package of the following format:
Electricity
|___ __main__.py
|
|___ Electricity
|    |___ general_functions
|    |___ regression_calcs
|    |    |___ create_calcs.py
|    |
|    |___ run_calcs.py
|
|___ Data_Input
     |___ regression_vals
          |___ regression_vals.csv
run_calcs.py runs the code in regression_calcs, which requires data from Data_Input/regression_vals.
What is the most pythonic way to find the number of ../ (number of times to go up a folder) until Data_Input is found?
This is because right now I'm running the scripts in Electricity/Electricity/run_calcs.py (for testing). Eventually I will be running in Electricity/__main__.py.
It will be used for df = pd.read_csv(f'{filepath}Data_Input/regression_vals/regression_vals.csv'), where filepath = '../'*n.
Upvotes: 1
Views: 2263
Reputation: 4100
I wrote a solution:
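from pathlib import Path
from typing import Union

def find_root_path(path: Union[str, Path], folder_name: str):
    path = Path(path)
    if (path / folder_name).is_dir():
        # this directory contains the folder we are looking for
        return path
    elif path.parent == path:
        # the root is its own parent, so there is nowhere left to go
        raise FileNotFoundError(f"Unable to find {folder_name}")
    else:
        # keep searching one level up, passing folder_name along
        return find_root_path(path.parent, folder_name)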
If you want to find a file, use is_file() instead of is_dir() or, if you want to support both, use .exists().
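For the .exists() variant, here is a minimal iterative sketch. The helper name find_ancestor_containing is hypothetical, and it assumes you want the nearest directory at or above the starting point that holds the name, whether that name is a file or a folder. It walks Path.parents instead of recursing.

```python
from pathlib import Path
from typing import Union


def find_ancestor_containing(start: Union[str, Path], name: str) -> Path:
    """Return the nearest directory at or above `start` that contains `name`.

    `name` may be a file or a directory, because .exists() accepts both.
    """
    start = Path(start).resolve()
    if start.is_file():
        # begin the search in the directory that holds the file
        start = start.parent
    # check the starting directory first, then every ancestor up to the root
    for candidate in (start, *start.parents):
        if (candidate / name).exists():
            return candidate
    raise FileNotFoundError(f"Unable to find {name} at or above {start}")
```

Something like find_ancestor_containing(__file__, "Data_Input") / "Data_Input" would then give the data folder as an absolute path, so the result does not depend on the current working directory.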
Upvotes: 0
Reputation: 517
This answer is a simplified version of A H's answer, using Micah Culpepper's exit condition.
import os

path = os.path.dirname(os.path.abspath(__file__))
while "Input_Data" not in os.listdir(path):
    if path == os.path.dirname(path):
        raise FileNotFoundError("could not find Input_Data")
    path = os.path.dirname(path)
Upvotes: 0
Reputation: 537
Here is an alternate implementation using pathlib and directly returning a Path object for the desired directory.
from pathlib import Path

def get_path_to_rel_location(directory_to_find):
    """Goes up in the directory hierarchy until it finds a directory that contains
    `directory_to_find` and returns the Path object of `directory_to_find`."""
    path = Path.cwd()
    num_tries = 5
    for num_up_folder in range(num_tries):
        path = path.parent
        if path / directory_to_find in path.iterdir():
            break
    else:
        # for/else: reached only when the loop ends without a break
        raise FileNotFoundError(f"The directory {directory_to_find} could not be found in the {num_tries}"
                                f" directories above the current working directory.")
    return path / directory_to_find

# Example usage
path = get_path_to_rel_location("Input_Data")
Upvotes: 1
Reputation: 1301
You can use Unipath.
from unipath import Path

path = Path("/Electricity/Data_Input/regression_vals/regression_vals.csv")
path = path.parent
path = path.parent

And now path refers to the /Electricity/Data_Input directory.
Upvotes: 2
Reputation: 2580
What I eventually used (a mix of avix's and pstatic's answers):
import os, unipath

def rel_location():
    """Goes up until it finds the folder 'Input_Data', then stops.
    Returns '' or '../' or '../../', or ... depending on how many times it had to go up."""
    path = unipath.Path(__file__)
    num_tries = 5
    for num_up_folder in range(num_tries):
        path = path.parent
        if 'Input_Data' in os.listdir(path):
            break
    else:
        # for/else: reached only when the loop ends without a break
        raise FileNotFoundError("The directory 'Input_Data' could not be found in the 5"
                                " directories above this file's location.")
    location = '../' * num_up_folder
    return location
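For anyone who wants the same '../' prefix without unipath or a fixed number of tries, here is a rough standard-library sketch. The function names levels_up_to and relative_prefix are made up for illustration, and it assumes the search should stop at the filesystem root.

```python
import os


def levels_up_to(start_dir: str, folder_name: str) -> int:
    """Count the parent hops from `start_dir` to a directory containing `folder_name`."""
    path = os.path.abspath(start_dir)
    levels = 0
    while not os.path.isdir(os.path.join(path, folder_name)):
        parent = os.path.dirname(path)
        if parent == path:
            # the root is its own parent, so the folder was not found
            raise FileNotFoundError(f"could not find {folder_name}")
        path = parent
        levels += 1
    return levels


def relative_prefix(start_dir: str, folder_name: str) -> str:
    """Build '', '../', '../../', ... from the number of hops."""
    return "../" * levels_up_to(start_dir, folder_name)
```

Keep in mind that a relative prefix like this is resolved against the current working directory when the file is opened, not against the script's location, so it only lines up when the script is run from its own folder.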
Upvotes: 1
Reputation: 11
os.scandir is useful for stuff like this.
import os

def find_my_cousin(me, cousin_name):
    """Find a file or directory named `cousin_name`. Start searching at `me`,
    and traverse directly up the file tree until found."""
    if not os.path.isdir(me):
        parent_folder = os.path.dirname(me)
    else:
        parent_folder = me
    folder = None
    removed = -1
    while folder != parent_folder:  # Stop if we hit the file system root
        folder = parent_folder
        removed += 1
        with os.scandir(folder) as ls:
            for f in ls:
                if f.name == cousin_name:
                    print(
                        "{} is your cousin, {} times removed, and she lives at {}"
                        "".format(f.name, removed, f.path)
                    )
                    return f.path
        parent_folder = os.path.normpath(os.path.join(folder, os.pardir))
Upvotes: 0
Reputation: 3858
Inside your files within regression_calcs:
from os import listdir
from os.path import abspath, join, isdir, dirname

filepath = None
# get the absolute parent directory of the .py file being run
par_dir = dirname(abspath(__file__))
while True:
    # get the names of all the directories in that parent
    dirs = [d for d in listdir(par_dir) if isdir(join(par_dir, d))]
    # the parent contains the desired directory
    if 'Data_Input' in dirs:
        filepath = par_dir
        break
    # stop at the filesystem root instead of looping forever
    if dirname(par_dir) == par_dir:
        raise FileNotFoundError("could not find Data_Input")
    # otherwise back out to the next parent
    par_dir = dirname(par_dir)
Of course this only works if you have a single '/Data_Input/' directory!
Upvotes: 3