Reputation: 101
I need to replicate the functionality of a file directory tree as a list. I have to be able to search for specific "documents" through the "folders", any of which may share a name with an item at another depth. I also have to be able to add new files and folders dynamically at runtime. For example, a file tree like this:
MyFiles
    Important
        doc1
        doc2
    LessImportant
        doc3
        doc4
    LowPriority
        Important
            doc1
        LessImportant
            doc4
If I use nested lists, the above tree would end up looking like:
[MyFiles,[Important,[doc1,doc2],LessImportant,[doc3,doc4],LowPriority,
[Important,[doc1],LessImportant,[doc4]]]]
And then I would have to run loops through all the nests to search for stuff and use .append to add new "folders" or "documents".
Is there a better / more efficient way than nested lists?
Upvotes: 0
Views: 211
Reputation: 1773
Using ElementTree gives search and iterate functions.
import os
import xml.etree.ElementTree as ET

def ls(p):
    # Directories become elements whose children are built recursively.
    if os.path.isdir(p):
        node = ET.Element(os.path.basename(p), type='dir')
        node.extend([ls(os.path.join(p, f)) for f in os.listdir(p)])
    # Anything else is treated as a file (a leaf element).
    else:
        node = ET.Element(os.path.basename(p), type='file')
    return node
Then test it by writing the tree out as XML, which is easy with ElementTree:
from xml.dom import minidom

root = ET.ElementTree(ls(r"C:\test\Myfiles"))

def pp(tree):
    # Pretty-print the tree, skipping the XML declaration on the first line.
    pretty = minidom.parseString(ET.tostring(tree.getroot())).toprettyxml(indent=' ')
    print(''.join(pretty.splitlines(True)[1:]))

pp(root)
Gives
<Myfiles type="dir">
 <Important type="dir">
  <doc1 type="file"/>
  <doc2 type="file"/>
 </Important>
 <LessImportant type="dir">
  <doc1 type="file"/>
  <doc2 type="file"/>
 </LessImportant>
 <LowPriority type="dir">
  <Important type="dir">
   <doc1 type="file"/>
  </Important>
  <LessImportant type="dir">
   <doc4 type="file"/>
  </LessImportant>
 </LowPriority>
</Myfiles>
You can play around to decide whether "dir" or "file" should be an element tag or an attribute.
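The search and add operations the question asks about are not shown above, so here is a minimal sketch of how they might look with ElementTree. The tree is built by hand instead of from disk, and the names parent_of and path_to are made up for this illustration. ElementTree elements do not keep a reference to their parent, so the sketch builds a child-to-parent map to recover full paths.

```python
import xml.etree.ElementTree as ET

# Build a small in-memory tree by hand (part of the question's example).
root = ET.Element("MyFiles", type="dir")
important = ET.SubElement(root, "Important", type="dir")
ET.SubElement(important, "doc1", type="file")
ET.SubElement(important, "doc2", type="file")
low = ET.SubElement(root, "LowPriority", type="dir")
nested = ET.SubElement(low, "Important", type="dir")
ET.SubElement(nested, "doc1", type="file")

# Search: iter() walks the whole subtree and yields every element with this tag.
matches = list(root.iter("doc1"))

# Elements have no parent pointer, so map each child to its parent once.
# (parent_of and path_to are hypothetical helper names.)
parent_of = {child: parent for parent in root.iter() for child in parent}

def path_to(node):
    """Return the slash-separated path from the root down to node."""
    parts = [node.tag]
    while node in parent_of:
        node = parent_of[node]
        parts.append(node.tag)
    return "/".join(reversed(parts))

paths = [path_to(m) for m in matches]
print(paths)

# Add: new folders and documents can be attached at runtime.
ET.SubElement(nested, "doc5", type="file")
```

If real file names are used as tags, names that are not valid XML names (for example ones containing spaces) will not serialize to well-formed XML, which is one reason to consider storing the name in an attribute instead.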
Upvotes: 1
Reputation: 1065
At first sight you might get the impression "Nah, that's too many lines of code" but it does have some great advantages (e.g. you're way more flexible).
Class / Basic Construct
class FileOrFolder:
    def __init__(self, name, children=None):
        self.name = name
        self.children = children if children else []

    def search_for(self, f_name):
        global hits  # defined later on
        for child in self.children:
            if child.name == f_name:
                hits.append(child.name)
            if child.children:
                child.search_for(f_name)
Recreating the File Tree
TREE = FileOrFolder("MyFiles", [
    FileOrFolder("Important", [
        FileOrFolder("doc1"),
        FileOrFolder("doc2")
    ]),
    FileOrFolder("LessImportant", [
        FileOrFolder("doc3"),
        FileOrFolder("doc4")
    ]),
    FileOrFolder("LowPriority", [
        FileOrFolder("Important", [
            FileOrFolder("doc1")
        ]),
        FileOrFolder("LessImportant", [
            FileOrFolder("doc4")
        ])
    ])
])
Application & Output
>>> hits = []
>>> TREE.search_for("doc4")
>>> print(hits)
['doc4', 'doc4']
NOTE: I don't know whether your overall goal is simply to create a file tree manually or to iterate automatically through an existing, real one and "copy" it. If it's the latter, you would need to make some slight changes.
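Because the same name can occur at several depths, it may help to have the search report where each hit lives and to avoid the global list. The following is only a sketch of one possible variant: the class name Node and the methods add and find are hypothetical and not part of the code above.

```python
class Node:
    """Hypothetical variant of FileOrFolder that returns paths instead of using a global."""

    def __init__(self, name, children=None):
        self.name = name
        self.children = list(children) if children else []

    def add(self, child):
        """Attach a new file or folder at runtime and return it."""
        self.children.append(child)
        return child

    def find(self, name, prefix=""):
        """Yield the full path of every descendant called name."""
        here = prefix + self.name
        for child in self.children:
            if child.name == name:
                yield here + "/" + child.name
            # Recurse; leaves have no children, so they yield nothing.
            yield from child.find(name, here + "/")


tree = Node("MyFiles", [
    Node("LessImportant", [Node("doc3"), Node("doc4")]),
    Node("LowPriority", [Node("LessImportant", [Node("doc4")])]),
])

found = list(tree.find("doc4"))
print(found)

# Add a new document to the LowPriority folder at runtime.
low = tree.children[1]
low.add(Node("doc9"))
```

Returning paths rather than bare names makes the two doc4 hits distinguishable, which the name-only result cannot do.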
Upvotes: 0
Reputation: 314
What about a structure like this, using the dict datatype:
{"ID": 0, "Type": 'Folder', "Name": 'MyFiles', "Subdirectories": [1, 2, 3]}
{"ID": 1, "Type": 'Folder', "Name": 'Important', "Subdirectories": []}
{"ID": 2, "Type": 'Folder', "Name": 'LessImportant', "Subdirectories": []}
{"ID": 3, "Type": 'Folder', "Name": 'LowPriority', "Subdirectories": [4, 5]}
{"ID": 4, "Type": 'Folder', "Name": 'Important', "Subdirectories": []}
{"ID": 5, "Type": 'Folder', "Name": 'LessImportant', "Subdirectories": []}
{"ID": 0, "Type": 'File', "Name": 'doc1', 'ParentDirectory': 1}
{"ID": 1, "Type": 'File', "Name": 'doc2', 'ParentDirectory': 1}
{"ID": 2, "Type": 'File', "Name": 'doc3', 'ParentDirectory': 2}
{"ID": 3, "Type": 'File', "Name": 'doc4', 'ParentDirectory': 2}
{"ID": 4, "Type": 'File', "Name": 'doc1', 'ParentDirectory': 4}
{"ID": 5, "Type": 'File', "Name": 'doc4', 'ParentDirectory': 5}
This would let you parse the data recursively. Here the files are numbered separately from the folders. Each file has a ParentDirectory entry, which is the ID of the directory the file is in. The folders have a list of subdirectories, and all elements are linked through the ID field.
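As a rough sketch of how these records could be searched and extended, the folders can be kept in a dict keyed by ID and the files in a list. The helper names parent_of, folder_path, find_files and add_folder are assumptions made for this example, as is keying the folders by ID rather than storing the ID inside each record.

```python
# Folder records keyed by their ID; "Subdirectories" holds child folder IDs.
folders = {
    0: {"Name": "MyFiles", "Subdirectories": [1, 2, 3]},
    1: {"Name": "Important", "Subdirectories": []},
    2: {"Name": "LessImportant", "Subdirectories": []},
    3: {"Name": "LowPriority", "Subdirectories": [4, 5]},
    4: {"Name": "Important", "Subdirectories": []},
    5: {"Name": "LessImportant", "Subdirectories": []},
}

# File records, numbered separately from the folders.
files = [
    {"ID": 0, "Name": "doc1", "ParentDirectory": 1},
    {"ID": 1, "Name": "doc2", "ParentDirectory": 1},
    {"ID": 2, "Name": "doc3", "ParentDirectory": 2},
    {"ID": 3, "Name": "doc4", "ParentDirectory": 2},
    {"ID": 4, "Name": "doc1", "ParentDirectory": 4},
    {"ID": 5, "Name": "doc4", "ParentDirectory": 5},
]

# Derive each folder's parent from the Subdirectories lists.
parent_of = {sub: fid for fid, rec in folders.items() for sub in rec["Subdirectories"]}

def folder_path(fid):
    """Walk up the parent links and return the folder's full path."""
    parts = []
    while fid is not None:
        parts.append(folders[fid]["Name"])
        fid = parent_of.get(fid)  # None once the root is reached
    return "/".join(reversed(parts))

def find_files(name):
    """Return the full path of every file record with this name."""
    return [folder_path(f["ParentDirectory"]) + "/" + f["Name"]
            for f in files if f["Name"] == name]

def add_folder(name, parent_id):
    """Create a folder under parent_id at runtime and return its new ID."""
    new_id = max(folders) + 1
    folders[new_id] = {"Name": name, "Subdirectories": []}
    folders[parent_id]["Subdirectories"].append(new_id)
    parent_of[new_id] = parent_id
    return new_id

print(find_files("doc4"))
archive_id = add_folder("Archive", 3)
```

With this layout a name search is a single pass over the flat file list, and the tree structure is only consulted to resolve the path of each hit.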
Upvotes: 0