P_S

Reputation: 345

How to open files from a very large folder quickly?

I have a folder with about 220,000 text files. I need to open them in a certain order, and do something with the content. At the moment, I just use open, and it takes, on average, about half a second to open a file. Is there any faster way to do it?

If it matters - I'm on Windows.

Upvotes: 1

Views: 935

Answers (2)

O. Jones

Reputation: 108706

I have had similar problems in the past. In my case it was a directory full of jpeg images I was trying to process. They had similar names in the first few characters of the filenames, and that caused real performance trouble.

There's a legacy-compatibility feature in NTFS that assigns each filename a shadow filename that complies with the old DOS 8.3 filename limitation. Yep, it used to be that you could only name a file ABCDEFGH.EXT, and file names couldn't be longer. The legacy-compatibility feature assigns a goofy alias name to every file that doesn't match 8.3, giving it a name like ABCDEF~1.EXT. When you have a lot of files, especially files whose names start with the same characters, the performance of this compatibility feature is talking-burro horrible.

I just checked my relatively new Windows 7 install, and the compatibility feature is still turned on.

You can turn off this feature for a whole volume using the fsutil program, which is covered in Microsoft's fsutil documentation. You'll need a cmd or PowerShell window with admin privileges to do this.

 fsutil 8dot3name query h:

will tell you whether this feature is enabled on your h drive.

 fsutil 8dot3name set h: 1

will disable creation of 8.3 names on your h drive (files that already have a short name keep it until you strip it). That might be destructive on your boot drive, especially if you have old-timey legacy software. When I implemented this, I made sure my directories containing tons of files were on a non-boot drive, and I left the boot drive alone.

You can strip the shadow 8.3 names from "all files that are located in a directory path" with this command

 fsutil 8dot3name strip /s h:\data\transactions

Stripping those names from your files in your big directory may help performance. (Back up the directory first, maybe with 7zip or something.)

Read the documentation for fsutil before stripping those legacy filenames!

Upvotes: 1

Daniel

Reputation: 42758

This is a problem of the underlying file system. Use a file system that is better suited to a large number of files, or build a directory tree: sort the files into nested directories by the first, second, third ... character of the filename, so that no single directory holds more than a small share of the files.
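
A minimal sketch of that directory-tree idea, assuming Python (the question only says "open", so the language is a guess). The names shard_path and shard_directory and the depth parameter are made up for illustration. Non-alphanumeric characters are mapped to "_" and letters are lower-cased so that directory names stay valid on a case-insensitive Windows volume.

```python
import os
import shutil
from pathlib import Path


def shard_path(root: Path, filename: str, depth: int = 3) -> Path:
    """Return root/<c1>/<c2>/.../filename using the first `depth` characters."""
    stem = Path(filename).stem
    # Pad short names so every file ends up exactly `depth` levels deep.
    prefix = stem[:depth].ljust(depth, "_")
    # Keep directory names safe: lower-case letters/digits, "_" for the rest.
    parts = [c.lower() if c.isalnum() else "_" for c in prefix]
    return root.joinpath(*parts, filename)


def shard_directory(src: Path, dst: Path, depth: int = 3) -> int:
    """Move every regular file in `src` into a prefix tree under `dst`."""
    # Collect names first so we do not move files while still scanning.
    with os.scandir(src) as entries:
        names = [entry.name for entry in entries if entry.is_file()]
    moved = 0
    for name in names:
        target = shard_path(dst, name, depth)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src / name), str(target))
        moved += 1
    return moved
```

Later lookups call shard_path again with the same depth to find a file without listing any directory. If most filenames share the same leading characters, a prefix tree will be lopsided; bucketing on a hash of the filename instead would be one way to even it out.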

Upvotes: 0
