Luk Aron
Luk Aron

Reputation: 1425

Data structure for files and folder tree?

For python, is there any common data structure for representing the tree in the file system in all OSs?

I currently use a dictionary and store the tree in this way.

C/
├─ C1
├─ C3/
│  ├─ C31
tree = 

{"title": "root",
 "child": [
     {"title": "C",
      "child": [{"title": "C1"

                 },
                {"title": "C3",
                 "child": [
                     {"title": "C31"

                      }
                 ]
                 }
                ]
      }
 ]}

however, this naive way lacks a lot of functions, such as comparing two trees, calculating a number of files, etc.

Is there any package that could do these functions and have a data structure specifically dealing with folder and file trees?

Upvotes: 2

Views: 2617

Answers (1)

lmonninger
lmonninger

Reputation: 961

Few ideas come to mind:

  1. If this is just in memory, you could create a tree in a more conventional OOP manner by doing something like:

    class Dir:
    
        parent : Dir
        children : List[Dir]
    
        def __init__(self, parent=None, children=[])
            self.parent = parent
            self.children = children
    
        # etc, those nice member functions below
    
  2. You could also represent your tree as a list of strings, and have the folders be a purely grammatical construct:

    ["thing/files/file_a.py", "thing/files/file_2.py", ...]
    

Interpreting the number of folders, children in subtree, and whatever else, could be accomplished by parsing the complete file or folder names.

  1. Building on 2, if you need to actually work with attributes of the folder, e.g. permissions, you could instead opt for dict where the file or folder names are the keys.
    {
       "thing/files/file_a.py" : { ...something}
       "thing/files" : {...something else}
    }
    

These last two methods work well when you're not just working in memory, e.g., a REST API. You can also do both with tuples as the keys.

  1. In my opinion, nested dictionaries of arbitrary depth are rarely the best way to go. All of the solutions proposed above actually tug on same thread: represent your tree as a set of relationships instead of some kind of composition.

Depending on what you are actually trying to accomplish with storing your folder structure, there may well be better solutions. (Maybe yours is the best one.) But, generally, it helps to think a bit creatively about what kinds of data actually need to be stored.

Upvotes: 3

Related Questions