movermeyer

Reputation: 1812

How to get filepaths that match a glob without having them on the filesystem

I have a list of filepaths relative to a root directory, and I am trying to determine which of them would be matched by a glob pattern. I want the same results I would get if all the files were on my filesystem and I ran Dir.glob(<my_glob_pattern>) from the root directory.

If this is the list of filepaths:

foo/index.md
foo/bar/index.md
foo/bar/baz/index.md
foo/bar/baz/qux/index.md

and this is the glob pattern:

foo/bar/*.md

If the files existed on my filesystem, Dir.glob('foo/bar/*.md') would return only foo/bar/index.md.

The glob docs mention fnmatch, so I tried using it, but found that the pattern foo/bar/*.md matched .md files in any number of nested subdirectories, much as Dir.glob('foo/bar/**/*.md') would, rather than only the direct children of the foo/bar directory:

my_glob = 'foo/bar/*.md'

filepaths = [
  'foo/index.md',
  'foo/bar/index.md',
  'foo/bar/baz/index.md',
  'foo/bar/baz/qux/index.md',
]

# Using the provided filepaths
filepaths_that_match_pattern = filepaths.select{|path| File.fnmatch?(my_glob, path)}.sort

# If the filepaths actually existed on my filesystem
filepaths_found_by_glob = Dir.glob(my_glob).sort

raise Exception.new("They don't match!") unless filepaths_that_match_pattern == filepaths_found_by_glob

I [incorrectly] expected the above code to work, but filepaths_found_by_glob only contains the direct children, while filepaths_that_match_pattern contains all the nested children too.

How can I get the same results as Dir.glob without having the file paths on my filesystem?

Upvotes: 0

Views: 880

Answers (2)

milind pincha

Reputation: 21

You can pass the File::FNM_PATHNAME flag when calling File.fnmatch, so that wildcards no longer match the directory separator. Your call would then look like this: File.fnmatch(pattern, path, File::FNM_PATHNAME)

You can see examples related to its usage here: https://apidock.com/ruby/File/fnmatch/class
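As a minimal sketch of that suggestion applied to the paths from the question (the helper name select_glob_matches is made up for illustration, not part of Ruby):

```ruby
# Hypothetical helper: keep only the paths that match the pattern, with "*"
# confined to a single directory level by File::FNM_PATHNAME.
def select_glob_matches(pattern, paths)
  paths.select { |path| File.fnmatch?(pattern, path, File::FNM_PATHNAME) }
end

filepaths = [
  'foo/index.md',
  'foo/bar/index.md',
  'foo/bar/baz/index.md',
  'foo/bar/baz/qux/index.md',
]

puts select_glob_matches('foo/bar/*.md', filepaths)
# >> foo/bar/index.md
```

This may still differ from Dir.glob in some cases. For example, Dir.glob expands {a,b} braces by default, while fnmatch only does so when File::FNM_EXTGLOB is also passed.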

Upvotes: 2

the Tin Man

Reputation: 160551

Don't use File.fnmatch; instead, use Pathname#fnmatch:

require 'pathname'

PATTERN = 'foo/bar/*.md'

%w[
  foo/index.md
  foo/bar/index.md
  foo/bar/baz/index.md
  foo/bar/baz/qux/index.md
].each do |p|

  puts 'path: %-24s %s' % [
    p, 
    Pathname.new(p).fnmatch(PATTERN) ? 'matches' : 'does not match'
  ]
end

# >> path: foo/index.md             does not match
# >> path: foo/bar/index.md         matches
# >> path: foo/bar/baz/index.md     matches
# >> path: foo/bar/baz/qux/index.md matches

File generally works with files or paths that exist on the drive, whereas the Pathname documentation says:

Pathname represents the name of a file or directory on the filesystem, but not the file itself.
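Note that the output above still reports the nested paths as matches. Pathname#fnmatch accepts the same optional flags as File.fnmatch, so a sketch that keeps * within one directory level might look like this (the helper name pathname_glob_match? is hypothetical):

```ruby
require 'pathname'

# Hypothetical helper: true when the path matches the pattern, with "*"
# not allowed to cross a "/" because File::FNM_PATHNAME is set.
def pathname_glob_match?(path, pattern)
  Pathname.new(path).fnmatch(pattern, File::FNM_PATHNAME)
end

%w[
  foo/index.md
  foo/bar/index.md
  foo/bar/baz/index.md
  foo/bar/baz/qux/index.md
].each do |path|
  puts 'path: %-24s %s' % [
    path,
    pathname_glob_match?(path, 'foo/bar/*.md') ? 'matches' : 'does not match'
  ]
end

# >> path: foo/index.md             does not match
# >> path: foo/bar/index.md         matches
# >> path: foo/bar/baz/index.md     does not match
# >> path: foo/bar/baz/qux/index.md does not match
```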

Also, regarding Dir.glob: be careful using it. It immediately tries to find every file or path on the drive that matches, and returns all of the hits at once. On a big or slow drive, or with a poorly written pattern (as can happen when debugging or testing), your code can be tied up for a long time, or Ruby or the machine it's running on can slow to a crawl. It only gets worse if you're checking a shared or remote drive. As an example of what can happen, try the following at your command line, but be prepared to hit Ctrl+C to regain control:

ls /**/*

Instead, I recommend using the Find module in the Standard Library, as it yields each path while it walks the directory tree rather than building the whole list first. See its documentation for examples.
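A rough sketch of what that could look like, assuming you want every file with a given extension under some root (the helper name files_with_extension and the temporary directory layout are invented for the demonstration):

```ruby
require 'find'
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: walk the tree under root with Find.find and collect
# the regular files whose extension equals ext (for example ".md").
def files_with_extension(root, ext)
  matches = []
  Find.find(root) do |path|
    # Find.find yields root itself, then every file and directory beneath it.
    matches << path if File.file?(path) && File.extname(path) == ext
  end
  matches.sort
end

# Demonstration on a throwaway directory so nothing real is scanned.
Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, 'foo/bar'))
  FileUtils.touch(File.join(root, 'foo/index.md'))
  FileUtils.touch(File.join(root, 'foo/bar/index.md'))
  FileUtils.touch(File.join(root, 'foo/bar/notes.txt'))

  puts files_with_extension(root, '.md')
end
```

Inside the block you can also call Find.prune to skip a directory entirely, which helps keep the walk away from large or remote subtrees.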

Upvotes: -1
