Reputation: 245
I'm making a program which the user build directories (not in windows, in my app) and in these folders there are subfolders and so on; every folder must contain either folders or documents. What is the best data structure to use? Notice that the user may select a subfolder and search for documents in it and in its subfolders. And I don't want to limit the folders or the subfolders levels.
Upvotes: 13
Views: 21807
Reputation: 1
A flat list has its advantages, such as easier handling, and possibly smaller space usage. To avoid repetition, just sort your full paths, then store for each item (file or folder) by its name without path.
level:int name:string (+ other attributes such as mtime, size)
Items on the list are implicitly children of the preceding lower level entry. All contents of any directory are found in rows directly following it until lower level.
The OP structure becomes (to omit root remove the row and decrement levels)
0 Root
1 Folder1
1 Folder2
2 Folder3
1 Folder4
2 Folder5
3 Folder6
As an alternative still, instead of level store for each entry how many next entries are inside it (possibly deeper in recursion). Makes it even easier to grab any folder recursively but makes the list hard to manipulate, so I don't like it. Easy enough with the levels:
loc = []
targetlevel = None
for [level, name, attrs] in table:
# Stop at end of directory
if targetlevel is not None and level <= targetlevel:
break
# Construct path fragments in loc for a full path
loc = loc[:level] + [name]
fullpath = "/".join(loc)
if fullpath == "path/to/search":
targetlevel = level
# If found, report back all items that belong to target
if targetlevel is not None:
yield fullpath, attrs
Upvotes: 0
Reputation: 145
I should recommend B+ Tree .... You can easily use indexing (page,folder etc ) and all .
B+ Tree http://commons.wikimedia.org/wiki/File:Btree.png
for more info : http://ozark.hendrix.edu/~burch/cs/340/reading/btree/index.html
Upvotes: 2
Reputation: 37655
Most OO languages come with some sort of abstraction for the file system, so there is where I would start. Then subclass it if you need to.
I would expect directories as an array of objects which are directories or files, for instance.
Upvotes: 0
Reputation: 103155
I know that the question is specifically asking for a data structure but...
If you are using an object oriented language maybe you can use the composite design pattern which is ideally suited for this type of hierarchical tree like structure. You get what you are asking for.
Upvotes: 0
Reputation: 17099
This is what I do:
Every record in the database has two fields: ID and ParentID. IDs are 4-5 characters (Base36, a-z:0-9 or something similar). Parent IDs are a concatenation of the parent's complete structure...
So...
This structure:
Root
Folder1
Folder2
Folder3
Folder4
Folder5
Folder6
Would be represented like this:
ID ParentID Name
0000 NULL ROOT
0001 0000 Folder1
0002 0000 Folder2
0003 00000002 Folder3
0004 0000 Folder4
0005 00000004 Folder5
0006 000000040005 Folder6
I like this structure because if I need to find all the files under a folder I can do a query like:
SELECT * FROM Folders WHERE ParentID LIKE '0000%' -- to find all folders under Folder1
To delete a folder and all its children:
DELETE FROM Folders WHERE ID='0004' AND ParentID LIKE '00000004%'
To move a folder and its children, you have to update all the records that use the same parent, to the new parent.
And I don't want to linit the folders or the subfolders levels
An obvious limitation to this is that the number of subfolders are limited to the size of your ParentID field.
Upvotes: 18
Reputation: 41667
I can think of a few ways you could structure this, but nothing would beat the obvious:
Use the actual file system.
Upvotes: 7