Reputation: 3938
Using Ruby, how do I get the number of files in a given directory? The count should include files in subdirectories, recursively.
E.g.: folder1 (2 files) -----> folder2 (4 files), where folder2 is inside folder1. The total count for this case should be 6 files.
Is there a function in Ruby that returns this count?
Upvotes: 34
Views: 20617
Reputation: 146
The current best answer does not count files in subdirectories whose names start with '.'. To count files in subdirectories like '.git' or '.vs', pass the File::FNM_DOTMATCH flag. I am a beginner here without 50 reputation points, so I can't comment on the best answer.
def folder_count(folder_path)
  Dir.glob("#{folder_path}/**/*", File::FNM_DOTMATCH).count { |file| File.file?(file) }
end
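To see the difference the flag makes, here is a minimal sketch that counts files with and without File::FNM_DOTMATCH. The helper name count_files and the include_hidden keyword are made up for this illustration.

```ruby
# Count regular files under folder_path, recursively.
# count_files and include_hidden are hypothetical names for this sketch.
def count_files(folder_path, include_hidden: false)
  # Without FNM_DOTMATCH, '**/*' skips dotfiles and does not descend
  # into directories such as .git.
  flags = include_hidden ? File::FNM_DOTMATCH : 0
  Dir.glob(File.join(folder_path, '**', '*'), flags)
     .count { |path| File.file?(path) }
end
```

Calling count_files(dir) and count_files(dir, include_hidden: true) on the same tree shows how many files live in dot-directories.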
https://ruby-doc.org/core-2.5.0/Dir.html#method-c-5B-5D
"Note, this will not match Unix-like hidden files (dotfiles). In order to include those in the match results, you must use the File::FNM_DOTMATCH flag or something like "{,.}"."
Upvotes: 1
Reputation: 179
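Using ~/Documents as an example.
One-line version:
# Dir[] does not expand ~, so expand it first; '**/*' matches entries at every depth
Dir[File.expand_path('~/Documents/**/*')].length
For longer paths a one-liner can be less readable, so:
path = File.expand_path('~/Documents/foo/bar')
Dir[File.join(path, '**', '*')].length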
Upvotes: 1
Reputation: 42207
I just had to find a way to get a list of files from a network share, which was taking a long time with Dir.glob. FileList from the rake gem seems to be the solution; a benchmark follows. The share is on a Windows server, and the script ran on a Windows 10 desktop with Ruby 2.3.0 x64. The network share had 754 files, of which 320 were the CSVs I was looking for. Some of the files were in subfolders.
require 'rake'
require 'benchmark'
source_path = '//server/share/**/*.csv'
puts FileList.new(source_path).size #320
puts Dir.glob(source_path).length #320
puts Dir[source_path].length #320
Benchmark.bm do |x|
x.report("FileList ") { 1.times { FileList.new(source_path) } }
x.report("Dir.glob ") { 1.times { Dir.glob(source_path) } }
x.report("Dir[] ") { 1.times { Dir[source_path] } }
end
Gives:
               user     system      total        real
FileList   0.000000   0.000000   0.000000 (  0.000073)
Dir.glob   0.031000   0.406000   0.437000 ( 11.324227)
Dir[]      0.047000   0.563000   0.610000 ( 11.887771)
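One caveat when reading these numbers: Rake's FileList is lazy, so FileList.new on its own may not touch the file system until the list is actually used. A fairer comparison would force the list to resolve, for example with to_a. The sketch below assumes the rake gem is installed; the helper names are hypothetical and no timings are claimed here.

```ruby
require 'rake'
require 'benchmark'

# Hypothetical helpers: both return the number of matches for a glob pattern.
def filelist_count(pattern)
  FileList.new(pattern).to_a.size # to_a forces the lazy FileList to resolve
end

def glob_count(pattern)
  Dir.glob(pattern).size
end

# Time both approaches once each actually walks the directory tree.
def compare_listing(pattern)
  Benchmark.bm(10) do |x|
    x.report('FileList') { filelist_count(pattern) }
    x.report('Dir.glob') { glob_count(pattern) }
  end
end
```

Usage would be something like compare_listing('//server/share/**/*.csv') with the share path from above.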
Old answer:
The fastest way on Windows for very big folders would be to use the command-line version of Search Everything, like this. I don't know whether Linux has something like Search Everything; if it does, please let us know.
ES = 'C:\Users\...\everything\es\es.exe'
def filelist(path)
  command = %Q{"#{ES}" "#{path}\\*"}
  list = []
  IO.popen(command + " 2>&1") do |pipe|
    while lijn = pipe.gets
      list << lijn
    end
  end
  list
end
filelist(path).count
See the results here for a relatively small folder (800+ files):
Benchmark.bmbm do |x|
x.report("glob ") { filelist(path).count }
x.report("everything") { Dir.glob("#{folder}/**/*").count }
end
Rehearsal ----------------------------------------------
glob 0.000000 0.032000 0.032000 ( 0.106887)
everything 0.000000 0.000000 0.000000 ( 0.001979)
------------------------------------- total: 0.032000sec
user system total real
glob 0.016000 0.015000 0.031000 ( 0.110030)
everything 0.000000 0.016000 0.016000 ( 0.001881)
Upvotes: 1
Reputation: 81
A slight modification and a comment
Dir['**/*'].count { |file| File.file?(file) }
works for me in Ruby 1.9.3, and is shorter.
A caveat, at least on my Windows 7 box, is that Dir['somedir/**/*'] doesn't work. I have to use
Dir.chdir(somedir) { Dir['**/*'] }
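If changing the working directory is awkward, Ruby 2.5 and later also accept a base: keyword on Dir.glob, which resolves the pattern relative to a directory without a chdir. A minimal sketch follows; the method name is hypothetical, and I have not verified that it avoids the Windows 7 issue described above.

```ruby
# Count regular files under somedir without changing the working directory.
# count_files_under is a hypothetical name for this sketch.
def count_files_under(somedir)
  # With base:, the returned paths are relative to somedir,
  # so join them back before testing with File.file?.
  Dir.glob('**/*', base: somedir)
     .count { |rel| File.file?(File.join(somedir, rel)) }
end
```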
Upvotes: 8
Reputation: 11
How about the following:
find . -type f | wc -l
Also, what are the downsides of using this over the Dir-based count?
Upvotes: 1
Reputation: 22203
You could also go super bare bones and do a system command:
count = `ls -1 #{dir} | wc -l`.to_i
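Note that ls -1 only lists the top level of the directory and counts subdirectories as entries too. A sketch of a recursive variant, assuming a Unix-like system with find and wc on the PATH; Shellwords.escape guards against spaces and shell metacharacters in the path. The method name is hypothetical.

```ruby
require 'shellwords'

# Count regular files under dir recursively by shelling out to find.
# shell_file_count is a hypothetical name; requires find and wc (Unix-like systems).
def shell_file_count(dir)
  `find #{Shellwords.escape(dir)} -type f | wc -l`.to_i
end
```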
Upvotes: 3
Reputation: 12397
The fastest way should be (not including directories in count):
Dir.glob(File.join(your_directory_as_variable_or_string, '**', '*')).select { |file| File.file?(file) }.count
And shorter:
dir = File.expand_path('~/Documents') # Dir[] does not expand ~ itself
Dir[File.join(dir, '**', '*')].count { |file| File.file?(file) }
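An alternative to globbing, if hidden files should be counted too, is the standard library's Find module, which walks the tree entry by entry. This is a different technique from the one above, shown only as a sketch; count_with_find is a hypothetical name.

```ruby
require 'find'

# Count regular files under dir, including dotfiles and files in dot-directories.
# count_with_find is a hypothetical name for this sketch.
def count_with_find(dir)
  # Find.find without a block returns an Enumerator over every path in the tree.
  Find.find(File.expand_path(dir)).count { |path| File.file?(path) }
end
```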
Upvotes: 40
Reputation: 610
Please try:
# We assume the variable folder1 holds an absolute path here
pattern = File.join(folder1, "**", "*")
count = Dir.glob(pattern).count
Upvotes: 0
Reputation: 88418
All you need is this, run in the current directory.
Dir["**/*"].length
It counts directories as files.
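To see how much the directories inflate that number, the entries can be split into files and directories. A small sketch; entry_counts is a hypothetical name:

```ruby
# Return separate counts of regular files and directories under dir.
# entry_counts is a hypothetical name for this sketch.
def entry_counts(dir = '.')
  entries = Dir.glob(File.join(dir, '**', '*'))
  files, others = entries.partition { |path| File.file?(path) }
  { files: files.size,
    directories: others.count { |path| File.directory?(path) } }
end
```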
Upvotes: 29