BerickCook

Reputation: 101

How to replicate a file tree as a Python list?

I need to replicate the functionality of a file directory tree as a list. I have to be able to search for specific "documents" through the "folders", and any of these names may be duplicated at other depths. I also have to be able to dynamically add new files and folders at runtime. For example, a file tree like this:

MyFiles
    Important
        doc1
        doc2
    LessImportant
        doc3
        doc4
    LowPriority
        Important
            doc1
        LessImportant
            doc4

If I use nested lists, the above tree would end up looking like:

["MyFiles", ["Important", ["doc1", "doc2"], "LessImportant", ["doc3", "doc4"], "LowPriority",
    ["Important", ["doc1"], "LessImportant", ["doc4"]]]]

I would then have to loop through every level of nesting to search for anything, and use .append to add new "folders" or "documents".

Is there a better / more efficient way than nested lists?

Upvotes: 0

Views: 211

Answers (3)

Mike Robins

Reputation: 1773

Using xml.etree.ElementTree gives you built-in search and iteration functions.

import os
import xml.etree.ElementTree as ET

def ls(p):
    if os.path.isdir(p):
        node = ET.Element(os.path.basename(p), type='dir')
        node.extend([ls(os.path.join(p, f)) for f in os.listdir(p)])
    else:
        node = ET.Element(os.path.basename(p), type='file')
    return node

To test this, write the tree out as XML, which is easy to do with ElementTree:

root = ET.ElementTree(ls(r"C:\test\Myfiles"))

from xml.dom import minidom

def pp(tree):
    # Pretty-print via minidom and drop the first line (the XML declaration).
    xml_text = minidom.parseString(ET.tostring(tree.getroot())).toprettyxml(indent='  ')
    print(''.join(xml_text.splitlines(True)[1:]))

pp(root)

Gives

<Myfiles type="dir">
  <Important type="dir">
    <doc1 type="file"/>
    <doc2 type="file"/>
  </Important>
  <LessImportant type="dir">
    <doc1 type="file"/>
    <doc2 type="file"/>
  </LessImportant>
  <LowPriority type="dir">
    <Important type="dir">
      <doc1 type="file"/>
    </Important>
    <LessImportant type="dir">
      <doc4 type="file"/>
    </LessImportant>
  </LowPriority>
</Myfiles>

You can play around to decide whether dir or file should be an element tag or an attribute.
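To illustrate the search and iteration functions mentioned at the top of this answer, here is a minimal sketch. It builds a small tree in memory rather than reading a disk (an assumption made only so the example is self-contained), then searches it and adds a new "document" at runtime.

```python
import xml.etree.ElementTree as ET

# Build a small tree by hand (demo assumption; ls() above would read it from disk).
root = ET.Element("MyFiles", type="dir")
important = ET.SubElement(root, "Important", type="dir")
ET.SubElement(important, "doc1", type="file")
low = ET.SubElement(root, "LowPriority", type="dir")
nested = ET.SubElement(low, "Important", type="dir")
ET.SubElement(nested, "doc1", type="file")

# iter() walks the whole subtree, so duplicate names at any depth are found.
all_doc1 = list(root.iter("doc1"))

# findall() accepts a limited XPath; this matches doc1 only under the top-level Important.
top_level_doc1 = root.findall("./Important/doc1")

# Adding a "document" at runtime is a single SubElement call.
ET.SubElement(nested, "doc5", type="file")
names_in_nested = [child.tag for child in nested]

print(len(all_doc1), len(top_level_doc1), names_in_nested)
```

One caveat when using names as element tags: a tag has to be a valid XML name, so file names containing spaces or starting with a digit would be safer stored in an attribute.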

Upvotes: 1

NewNewton

Reputation: 1065

The OOP Approach

At first sight you might get the impression "Nah, that's too many lines of code" but it does have some great advantages (e.g. you're way more flexible).

Class / Basic Construct

class FileOrFolder:

    def __init__(self, name, children=None):
        self.name = name
        self.children = children if children else []

    def search_for(self, f_name):
        global hits  # defined later on

        for child in self.children:

            if child.name == f_name:
                hits.append(child.name)

            if child.children:
                child.search_for(f_name)

Recreating the File Tree

TREE = FileOrFolder("MyFiles", [
    FileOrFolder("Important", [
        FileOrFolder("doc1"),
        FileOrFolder("doc2")
    ]),
    FileOrFolder("LessImportant", [
        FileOrFolder("doc3"),
        FileOrFolder("doc4")
    ]),
    FileOrFolder("LowPriority", [
        FileOrFolder("Important", [
            FileOrFolder("doc1")
        ]),
        FileOrFolder("LessImportant", [
            FileOrFolder("doc4")
        ])
    ])
])

Application & Output

>>> hits = []
>>> TREE.search_for("doc4")
>>> print(hits)

['doc4', 'doc4']
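Since search_for collects bare names through a global list, the two hits for "doc4" cannot be told apart. A possible variation, sketched here with a hypothetical Node class and hypothetical add/find methods (none of these names come from the code above), returns the full path of each match and needs no global:

```python
class Node:
    """Hypothetical variant of FileOrFolder whose search returns full paths."""

    def __init__(self, name, children=None):
        self.name = name
        self.children = children if children else []

    def add(self, child):
        # Attach a new file or folder at runtime and return it for chaining.
        self.children.append(child)
        return child

    def find(self, target, prefix=""):
        # Build this node's path, then collect matches from the whole subtree.
        path = prefix + "/" + self.name if prefix else self.name
        matches = [path] if self.name == target else []
        for child in self.children:
            matches.extend(child.find(target, path))
        return matches


tree = Node("MyFiles", [
    Node("LessImportant", [Node("doc3"), Node("doc4")]),
    Node("LowPriority", [Node("LessImportant", [Node("doc4")])]),
])
tree.children[0].add(Node("doc5"))  # dynamic insertion

doc4_paths = tree.find("doc4")
print(doc4_paths)
# ['MyFiles/LessImportant/doc4', 'MyFiles/LowPriority/LessImportant/doc4']
```

Unlike search_for, this sketch also checks the node it is called on, not only its children.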

NOTE: I don't know whether your overall goal is to create a file tree manually or to iterate automatically through an existing, real one and "copy" it. If it's the latter, you would need to make some slight changes.
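For the second case in the note, one way the "slight changes" might look is a recursive builder on top of the same class. This is only a sketch: build_tree is a hypothetical helper name, and the temporary directory exists purely to make the example runnable anywhere.

```python
import os
import tempfile


class FileOrFolder:
    def __init__(self, name, children=None):
        self.name = name
        self.children = children if children else []


def build_tree(path):
    """Mirror the directory at `path` as nested FileOrFolder objects."""
    node = FileOrFolder(os.path.basename(os.path.normpath(path)))
    if os.path.isdir(path):
        # Sort entries so the resulting tree has a stable order.
        for entry in sorted(os.listdir(path)):
            node.children.append(build_tree(os.path.join(path, entry)))
    return node


# Demo on a throwaway directory (assumption: any real path would work the same way).
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "MyFiles", "Important"))
    open(os.path.join(tmp, "MyFiles", "Important", "doc1"), "w").close()
    tree = build_tree(os.path.join(tmp, "MyFiles"))

top_names = [child.name for child in tree.children]
print(tree.name, top_names)
```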

Upvotes: 0

What

Reputation: 314

What about such a structure using the dict datatype:

{"ID": 0, "Type": 'Folder', "Name": 'MyFiles', "Subdirectories": [1, 2, 3]}
{"ID": 1, "Type": 'Folder', "Name": 'Important', "Subdirectories": []}
{"ID": 2, "Type": 'Folder', "Name": 'LessImportant', "Subdirectories": []}
{"ID": 3, "Type": 'Folder', "Name": 'LowPriority', "Subdirectories": [4, 5]}
{"ID": 4, "Type": 'Folder', "Name": 'Important', "Subdirectories": []}
{"ID": 5, "Type": 'Folder', "Name": 'LessImportant', "Subdirectories": []}

{"ID": 0, "Type": 'File', "Name": 'doc1', 'ParentDirectory': 1}
{"ID": 1, "Type": 'File', "Name": 'doc2', 'ParentDirectory': 1}
{"ID": 2, "Type": 'File', "Name": 'doc3', 'ParentDirectory': 2}
{"ID": 3, "Type": 'File', "Name": 'doc4', 'ParentDirectory': 2}
{"ID": 4, "Type": 'File', "Name": 'doc1', 'ParentDirectory': 4}
{"ID": 5, "Type": 'File', "Name": 'doc4', 'ParentDirectory': 5}

This would let you parse the data recursively. Here the files are numbered separately from the folders. Each file has a ParentDirectory entry, which holds the ID of the directory the file is in. Each folder has a list of Subdirectories, and all elements are linked through the ID field.
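As a rough sketch of that recursive parsing, the records can be kept in a dict keyed by folder ID plus a list of file records. The container names folders and files and the helpers folder_paths and find_file are assumptions for illustration, and only part of the tree is reproduced to keep it short.

```python
# Folder records keyed by their ID for direct lookup (subset of the tree above).
folders = {
    0: {"Type": "Folder", "Name": "MyFiles", "Subdirectories": [1, 3]},
    1: {"Type": "Folder", "Name": "Important", "Subdirectories": []},
    3: {"Type": "Folder", "Name": "LowPriority", "Subdirectories": [4]},
    4: {"Type": "Folder", "Name": "Important", "Subdirectories": []},
}
files = [
    {"ID": 0, "Type": "File", "Name": "doc1", "ParentDirectory": 1},
    {"ID": 1, "Type": "File", "Name": "doc2", "ParentDirectory": 1},
    {"ID": 4, "Type": "File", "Name": "doc1", "ParentDirectory": 4},
]


def folder_paths(folder_id=0, prefix=""):
    """Recursively map every folder ID below folder_id to its full path."""
    folder = folders[folder_id]
    path = prefix + "/" + folder["Name"] if prefix else folder["Name"]
    paths = {folder_id: path}
    for sub_id in folder["Subdirectories"]:
        paths.update(folder_paths(sub_id, path))
    return paths


def find_file(name):
    """Return the full path of every file record with the given name."""
    paths = folder_paths()
    return [paths[f["ParentDirectory"]] + "/" + f["Name"]
            for f in files if f["Name"] == name]


# Adding a file at runtime is one append; only the parent ID links it in.
files.append({"ID": 6, "Type": "File", "Name": "doc9", "ParentDirectory": 3})

doc1_paths = find_file("doc1")
print(doc1_paths)
# ['MyFiles/Important/doc1', 'MyFiles/LowPriority/Important/doc1']
```

Adding a folder takes two steps in this layout: create the new record and append its ID to the parent's Subdirectories list.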

Upvotes: 0
