Homde
Homde

Reputation: 4286

Store and search directory tree in memory efficiently

I want to store all directories on a huge drive as efficiently in memory as possible and also be able to retrieve a directory given it's full path. Each directory has fields for it's name (not it's full path) and a pointer to it's parent and a list of subdirectories. Which way do you think the way to go is?

As I see it there's a couple of ways:

a) Store the full paths to each directory in a dictionary and do a simple lookup. Pros: fast, Cons: each full path string takes up uneccessary and redundant amount of memory

b) Store just the actual directory name in a dictionary with a list of all directories with that name, then check the matches if it's correct: Pros: pretty fast, Cons: has to either store a list for each directory or use boxing to store either a list or directory in the dictionary.

c) Skip the dictionary, traverse the tree from the root and find a match by splitting the path. Perhaps use PLINQ to speed things up. Pros: No memory overhead with the dictionary, cons: potentially slower than lookup.

d) some other way i haven't thought of...

Upvotes: 4

Views: 1806

Answers (2)

Jon Hanna
Jon Hanna

Reputation: 113232

If you could store the subdirectories as a dictionary rather than as a list (and for cases where you want all the subdirectories, that's easily done using the Values property) then you can step through the path with each step being O(1) and hence the complexity of finding the directory from the full path being O(n) where n is the number of steps in the path, not related to the number of directories in the system.

Upvotes: 3

TomTom
TomTom

Reputation: 62093

Use a atabase. Point. Problem is the efficient search if the tree is not trivially small. It needs an index.

Skip the dictionary, make an enumerator that traverses the whole tree and finds a match

Not "efficient" but the worsst possible solution time wise that is not a total waste programming wise and making things slower than a no brainer.

The problem is that efficient partial lookups require an index which is a lot of programming to maintain, compared to use something like SqlLite in the temp directory.

Upvotes: -1

Related Questions