Reputation: 63
I have the directory tree shown above. I need to search its directories and files recursively and return them as a dictionary of the following form: key: path of the file (directories/file name), value: first line of the file.
E.g. key: 1/2/5/test5, value: first line of test5
So far, I have written the following code:
def search(root):
    items = os.listdir(root)
    for element in items:
        if os.path.isfile(element):
            with open (element) as file:
                one_line=file.readline()
                print(one_line)
        elif os.path.isdir(element):
            search(os.path.join(root,element))
The problem is that my code only searches the directories. Please help me understand where I'm going wrong and how to fix it. Many thanks for any help!
Upvotes: 0
Views: 1336
Reputation: 1315
Your code is almost correct; it only needs a small adjustment. More specifically, element is a file or directory name, not a path. Unless root is the current working directory, os.path.isfile(element) and os.path.isdir(element) are resolved against the wrong directory, so for a file or subdirectory inside root both will normally be False. Hence, test os.path.isfile(os.path.join(root, element)) and os.path.isdir(os.path.join(root, element)) instead.
Similarly, with open(element) should be replaced by with open(os.path.join(root, element)).
When reading the file's first line, you have to store the path and that line in a dictionary.
That dictionary then has to be updated with the result of the recursive call made in the os.path.isdir branch.
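To make the name-versus-path point concrete, here is a minimal sketch that builds a throwaway directory and runs both checks on the same entry. The helper name compare_checks and the file name are made up for illustration; they are not part of the original code.

```python
import os
import tempfile

def compare_checks(filename="only_in_temp_tree_7c1e.txt"):
    """Return (bare_check, joined_check) for a file that exists only in a temp subdirectory."""
    with tempfile.TemporaryDirectory() as root:
        sub = os.path.join(root, "sub")
        os.mkdir(sub)
        with open(os.path.join(sub, filename), "w") as f:
            f.write("first line\n")

        element = os.listdir(sub)[0]  # bare name, as returned by os.listdir
        # Resolved against the current working directory, not against `sub`
        bare_check = os.path.isfile(element)
        # Resolved against the directory that was actually listed
        joined_check = os.path.isfile(os.path.join(sub, element))
        return bare_check, joined_check

print(compare_checks())
```

As long as the current working directory does not happen to contain a file with that name, the bare check is False and the joined check is True.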
See below for the complete snippet:
import os

def search(root):
    my_dict = {}  # this is the final dictionary to be populated
    for element in os.listdir(root):
        if os.path.isfile(os.path.join(root, element)):
            try:
                with open(os.path.join(root, element)) as file:
                    my_dict[os.path.join(root, element)] = file.readline()  # populate the dictionary
            except UnicodeDecodeError:
                # This exception handling has been put here to ignore decode errors (some files cannot be read)
                pass
        elif os.path.isdir(os.path.join(root, element)):
            my_dict.update(search(os.path.join(root, element)))  # update the current dictionary with the one resulting from the recursive call
    return my_dict

print(search('.'))
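It prints a dictionary like the one below (shown here pretty-printed; each real value also keeps the trailing newline returned by readline()):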
{
"path/file.csv": "name,surname,grade",
"path/to/file1.txt": "this is the first line of file 1",
"path/to/file2.py": "import os"
}
For the sake of readability, os.path.join(root, element) can be stored in a variable:
import os

def search(root):
    my_dict = {}  # this is the final dictionary to be populated
    for element in os.listdir(root):
        path = os.path.join(root, element)
        if os.path.isfile(path):
            with open(path) as file:
                my_dict[path] = file.readline()
        elif os.path.isdir(path):
            my_dict.update(search(path))
    return my_dict

print(search('.'))
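The question asks for keys such as 1/2/5/test5, i.e. paths relative to the starting directory, and for the first line without its line break. A possible variant is sketched below; the function name search_relative, the base parameter, and the choice of errors="replace" (instead of skipping undecodable files) are assumptions made for this example, not something the original code does.

```python
import os

def search_relative(root, base=None):
    """Map each file's path, relative to the starting directory, to its first line."""
    base = root if base is None else base  # remember where the search started
    result = {}
    for element in os.listdir(root):
        path = os.path.join(root, element)
        if os.path.isfile(path):
            # errors="replace" substitutes undecodable bytes instead of raising
            with open(path, errors="replace") as file:
                # Use forward slashes in keys regardless of the platform separator
                key = os.path.relpath(path, base).replace(os.sep, "/")
                result[key] = file.readline().rstrip("\n")
        elif os.path.isdir(path):
            # Recurse, keeping the original starting directory as the base
            result.update(search_relative(path, base))
    return result
```

Calling search_relative('.') would then return keys without the leading './' prefix.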
Upvotes: 1
Reputation: 11
You can use os.walk, which traverses the whole tree for you.
The following function will not include empty folders, because only files are added to the dictionary.
import os

def get_tree(startpath):
    tree = {}
    # os.walk yields every directory under startpath together with its files
    for root, dirs, files in os.walk(startpath):
        for file in files:
            path = os.path.join(root, file)
            with open(path, 'r') as f:
                first_line = f.readline()
            tree[path] = first_line
    return tree
The output will be like this:
{
file_path : first_line_of_the_file,
file_path2 : first_line_of_the_file2,
...
}
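As a quick check of the empty-folder behaviour, the sketch below runs an os.walk-based function of the same shape on a temporary tree. The demo helper and the directory and file names are invented for this example.

```python
import os
import tempfile

def get_tree(startpath):
    """Map every file path under startpath to the first line of that file."""
    tree = {}
    for root, dirs, files in os.walk(startpath):
        for file in files:
            path = os.path.join(root, file)
            with open(path, "r") as f:
                tree[path] = f.readline()
    return tree

def demo():
    """Build <tmp>/a/notes.txt plus an empty folder <tmp>/a/empty, then scan it."""
    with tempfile.TemporaryDirectory() as tmp:
        os.makedirs(os.path.join(tmp, "a", "empty"))
        with open(os.path.join(tmp, "a", "notes.txt"), "w") as f:
            f.write("hello\nworld\n")
        tree = get_tree(tmp)
        # Report keys relative to the temporary directory for readability
        return {os.path.relpath(p, tmp): line for p, line in tree.items()}

print(demo())
```

Only notes.txt appears in the result; the folder named empty contributes no key, and the stored first line still ends with its newline character.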
Upvotes: 1