Lohith MV

Reputation: 3938

How to get the file count in a directory using Ruby

Using Ruby, how can I get the number of files in a given directory? The count should include files in nested subdirectories.

E.g. folder1 (2 files) -----> folder2 (4 files), where folder2 is inside folder1. The total count for this case should be 6 files.

Is there any function in Ruby that gives me this count?

Upvotes: 34

Views: 20617

Answers (9)

Mariusz

Reputation: 146

The current best answer does not count files in subdirectories whose names start with '.'. To count files in subdirectories like '.git' or '.vs', we need to pass the File::FNM_DOTMATCH flag. I am a beginner here without 50 reputation points, so I cannot comment on the best answer.

def folder_count(folder_path)
  Dir.glob("#{folder_path}/**/*", File::FNM_DOTMATCH).count { |file| File.file?(file) }
end
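
The effect of the flag can be checked on a throwaway directory. This is a minimal sketch, not part of the original answer: count_files and hidden_demo are hypothetical helper names, and the file layout is made up for illustration.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: count regular files under root,
# optionally including dotfiles and dot-directories.
def count_files(root, include_hidden: false)
  flags = include_hidden ? File::FNM_DOTMATCH : 0
  Dir.glob(File.join(root, '**', '*'), flags).count { |path| File.file?(path) }
end

# Builds a temp tree with one visible file, one dotfile,
# and one file inside a dot-directory, then counts both ways.
def hidden_demo
  Dir.mktmpdir do |root|
    FileUtils.mkdir_p(File.join(root, '.git'))
    FileUtils.touch(File.join(root, 'visible.txt'))
    FileUtils.touch(File.join(root, '.hidden'))
    FileUtils.touch(File.join(root, '.git', 'config'))
    [count_files(root), count_files(root, include_hidden: true)]
  end
end

p hidden_demo
```

The first number counts only visible.txt; the second also picks up .hidden and .git/config.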

https://ruby-doc.org/core-2.5.0/Dir.html#method-c-5B-5D

"Note, this will not match Unix-like hidden files (dotfiles). In order to include those in the match results, you must use the File::FNM_DOTMATCH flag or something like "{,.}"."

Upvotes: 1

Matheus Porto

Reputation: 179

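Using ~/Documents as an example.

As a one-liner (Dir[] does not expand ~, and the pattern needs a wildcard to match anything inside the directory):

Dir[File.expand_path('~/Documents/**/*')].length

For longer paths a one-liner can be less readable, so:

path = File.expand_path('~/Documents/foo/bar')

Dir[File.join(path, '**', '*')].length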

Upvotes: 1

peter

Reputation: 42207

I just had to find a way to list files on a network share where Dir.glob was taking a long time. FileList from the rake gem seems to be the solution; a benchmark follows. The share is on a Windows server and the script ran on a Windows 10 desktop with Ruby 2.3.0 x64. The network share had 754 files, of which 320 were the CSVs I was looking for. Some of the files were in subfolders.

require 'rake'
require 'benchmark'

source_path = '//server/share/**/*.csv'
puts FileList.new(source_path).size #320
puts Dir.glob(source_path).length #320
puts Dir[source_path].length #320

Benchmark.bm do |x| 
  x.report("FileList  ") { 1.times { FileList.new(source_path) } }
  x.report("Dir.glob  ") { 1.times { Dir.glob(source_path) } }
  x.report("Dir[]     ") { 1.times { Dir[source_path] } } 
end 

Gives

       user     system      total        real
FileList    0.000000   0.000000   0.000000 (  0.000073)
Dir.glob    0.031000   0.406000   0.437000 ( 11.324227)
Dir[]       0.047000   0.563000   0.610000 ( 11.887771)

Old answer:

The fastest way on Windows for very big folders would be to use the command-line version of Search Everything, like this. I don't know whether Linux has something like Search Everything; if it does, please let us know.

ES = 'C:\Users\...\everything\es\es.exe'

def filelist path
  command = %Q{"#{ES}" "#{path}\\*"}
  list = []
  IO.popen(command+" 2>&1") do |pipe|
    while lijn = pipe.gets
      list << lijn
    end
  end
  list
end

filelist(path).count

Here are the results for a relatively small folder (800+ files):

Benchmark.bmbm do |x|
  x.report("glob      ") { Dir.glob("#{path}/**/*").count }
  x.report("everything") { filelist(path).count }
end

Rehearsal ----------------------------------------------
glob         0.000000   0.032000   0.032000 (  0.106887)
everything   0.000000   0.000000   0.000000 (  0.001979)
------------------------------------- total: 0.032000sec

                 user     system      total        real
glob         0.016000   0.015000   0.031000 (  0.110030)
everything   0.000000   0.016000   0.016000 (  0.001881)

Upvotes: 1

dr_eck

Reputation: 81

A slight modification and a comment

Dir['**/*'].count { |file| File.file?(file) }

works for me in Ruby 1.9.3, and is shorter.

A caveat, at least on my Windows 7 box, is that Dir['somedir/**/*'] doesn't work. I have to use

Dir.chdir(somedir) { Dir['**/*'] }
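
On Ruby 2.5 or later, the base: keyword of Dir.glob is another way to get the same result without changing directory. A minimal sketch, assuming a hypothetical helper named count_files_under; the results are relative to the base directory, so they have to be joined back onto it before calling File.file?.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: count regular files below base_dir without chdir.
# Needs Ruby 2.5+ for the base: keyword.
def count_files_under(base_dir)
  Dir.glob('**/*', base: base_dir).count do |relative|
    # glob results are relative to base_dir, so rebuild the full path
    File.file?(File.join(base_dir, relative))
  end
end

# Builds a temp tree with one file at the top and one in a subdirectory.
def base_demo
  Dir.mktmpdir do |root|
    FileUtils.mkdir_p(File.join(root, 'sub'))
    FileUtils.touch(File.join(root, 'a.txt'))
    FileUtils.touch(File.join(root, 'sub', 'b.txt'))
    count_files_under(root)
  end
end

p base_demo
```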

Upvotes: 8

Sarguru Nathan

Reputation: 11

How about the following:

find . -type f | wc -l

Also, what are the downsides of using this over the Dir-based count?
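
One likely downside is portability: the shell pipeline needs Unix find and wc and spawns extra processes. Ruby's standard Find module does a similar traversal in-process. A minimal sketch, assuming a hypothetical helper named count_with_find; like find, it also visits dotfiles and dot-directories.

```ruby
require 'find'
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: walk root recursively and count regular files.
# Note: File.file? follows symlinks, whereas find's "-type f" does not.
def count_with_find(root)
  Find.find(root).count { |path| File.file?(path) }
end

# Builds a temp tree with a visible file, a dotfile and a nested file.
def find_demo
  Dir.mktmpdir do |root|
    FileUtils.mkdir_p(File.join(root, 'sub'))
    FileUtils.touch(File.join(root, 'a.txt'))
    FileUtils.touch(File.join(root, '.hidden'))
    FileUtils.touch(File.join(root, 'sub', 'b.txt'))
    count_with_find(root)
  end
end

p find_demo
```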

Upvotes: 1

matsko

Reputation: 22203

You could also go super bare bones and do a system command:

count = `ls -1 #{dir} | wc -l`.to_i 

Upvotes: 3

Mario Uher

Reputation: 12397

The fastest way should be (not including directories in count):

Dir.glob(File.join(your_directory_as_variable_or_string, '**', '*')).select { |file| File.file?(file) }.count

And shorter:

dir = File.expand_path('~/Documents') # Dir[] does not expand ~ by itself
Dir[File.join(dir, '**', '*')].count { |file| File.file?(file) }

Upvotes: 40

ControlPower

Reputation: 610

Please try:

# we suppose that the variable folder1 is an absolute path here
pattern = File.join(folder1, "**", "*")
count = Dir.glob(pattern).count

Upvotes: 0

Ray Toal

Reputation: 88418

All you need is this, run in the current directory.

Dir["**/*"].length

It counts directories as files.
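
To see how much the directories add, here is a minimal sketch that rebuilds the layout from the question (folder1 with 2 files, folder2 inside it with 4 files) in a temp directory. The helper names entry_counts and question_layout_demo and the file names are made up for illustration.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: split glob results into regular files and directories.
def entry_counts(root)
  entries = Dir.glob(File.join(root, '**', '*'))
  files, others = entries.partition { |path| File.file?(path) }
  {
    total: entries.length,
    files: files.length,
    directories: others.count { |path| File.directory?(path) }
  }
end

# Recreates folder1 (2 files) containing folder2 (4 files) and counts entries.
def question_layout_demo
  Dir.mktmpdir do |root|
    folder1 = File.join(root, 'folder1')
    folder2 = File.join(folder1, 'folder2')
    FileUtils.mkdir_p(folder2)
    2.times { |i| FileUtils.touch(File.join(folder1, "a#{i}.txt")) }
    4.times { |i| FileUtils.touch(File.join(folder2, "b#{i}.txt")) }
    entry_counts(folder1)
  end
end

p question_layout_demo
```

Here the plain length is 7 because folder2 itself is one of the entries, while filtering with File.file? gives the 6 files the question asks for.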

Upvotes: 29
