user248237

Limits on the number of files in a single directory in Unix/Linux using Python

Is it bad to output many files to the same directory in Unix/Linux? I run thousands of jobs on a cluster, and each one writes a file to the same directory. The upper bound is around 50,000 files. Can this limit IO speed? If so, does the problem go away with a nested directory structure?

Thanks.

Upvotes: 2

Views: 1832

Answers (3)

ghostdog74

Reputation: 342463

My suggestion is to use a nested directory structure (i.e., categorization). You can name the directories using timestamps, a special prefix for each application, etc. This gives you a sense of order when you need to search for specific files and makes the files easier to manage.
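
As a rough sketch of that kind of categorization in Python: the function name `categorized_path`, the `job_<id>.out` naming, and the one-directory-per-day layout below are all just assumptions for illustration, so adjust them to whatever your jobs actually produce.

```python
from datetime import datetime
from pathlib import Path


def categorized_path(base_dir, app_prefix, job_id, when=None):
    """Return base_dir/<app_prefix>/<YYYY-MM-DD>/job_<job_id>.out.

    The layout and file naming are hypothetical; the directories are
    created if they do not exist yet.
    """
    when = when or datetime.now()
    directory = Path(base_dir) / app_prefix / when.strftime("%Y-%m-%d")
    # exist_ok=True avoids an error when several jobs create the same
    # directory at about the same time.
    directory.mkdir(parents=True, exist_ok=True)
    return directory / f"job_{job_id}.out"
```

Each job would then open `categorized_path("results", "myapp", job_id)` for writing instead of dropping its file into one flat directory.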

Upvotes: 0

maerics

Reputation: 156444

I believe that many filesystems store the names of contained files in a list (or some other data structure with linear-time lookup), so storing a large number of files in a single directory can slow down even simple operations like listing. A nested structure can ameliorate this problem by arranging the names in a tree (or even a trie, if it makes sense), which can reduce the time it takes to retrieve file stats.
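
One possible way to build such a tree when the file names have no natural categories is to derive the subdirectory from a hash of the name. This is only a sketch under my own assumptions: `sharded_path`, `levels`, and `width` are made-up names, and SHA-256 is just a convenient, evenly distributed hash, not something the filesystem requires.

```python
import hashlib
from pathlib import Path


def sharded_path(base_dir, filename, levels=1, width=2):
    """Map filename to base_dir/<hex>/.../filename using its hash.

    Each level uses `width` hex digits of the SHA-256 digest, so the
    defaults spread files over 16**2 = 256 subdirectories.
    """
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return Path(base_dir).joinpath(*parts, filename)


def open_sharded(base_dir, filename, mode="w"):
    """Create the shard directory if needed and open the file."""
    path = sharded_path(base_dir, filename)
    path.parent.mkdir(parents=True, exist_ok=True)
    return path.open(mode)
```

With the defaults, 50,000 files end up at roughly 200 per directory on average, and because the path is computed from the name alone, a reader can find a file later without searching the tree.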

Upvotes: 0
