Steve K

Reputation: 387

Filter a ruby array based on values in another array

I have an array of extensions and an array of file names:

exts = ['.zip', '.tgz', '.sql']
files = ['file1.txt', 'file2.doc', 'file2.tgz', 'file3.sql', 'file6.foo', 'file4.zip']

I want to filter the file names by one or more matching extensions. In this case, the result would be:

["file2.tgz", "file3.sql", "file4.zip"]

I know I can do this with a nested loop:

exts.each_with_object([]) do |ext, arr|
    files.each do |file|
        arr << file if file.include?(ext)
    end
end

This feels ugly to me. With select, I can avoid the feeling of nested loops:

files.select { |file| exts.any? { |ext| file.include?(ext) } }

This works and feels better. Is there still a more elegant way that I'm missing?

Upvotes: 1

Views: 1167

Answers (3)

Sergio Tulentsev

Reputation: 230346

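If you store the extensions in a set, each lookup takes constant time on average, so the runtime complexity drops from O(NM) to O(N) for N files and M extensions. Comparing against File.extname also checks the actual extension rather than any substring of the name.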

require 'set' # not needed on Ruby 3.2+, where Set is loaded automatically

exts = ['.zip', '.tgz', '.sql'].to_set
files.select { |file| exts.include?(File.extname(file)) }
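Here is a minimal sketch of how that behaves with the sample data from the question. The name 'notes.zip.txt' is a hypothetical file I added only to show where a substring check with include? and an extension check with File.extname disagree.

```ruby
require 'set'

exts  = ['.zip', '.tgz', '.sql'].to_set
files = ['file1.txt', 'file2.doc', 'file2.tgz', 'file3.sql', 'file6.foo', 'file4.zip']

# One set lookup per file instead of scanning every extension.
matched = files.select { |file| exts.include?(File.extname(file)) }

# 'notes.zip.txt' is a hypothetical name, not part of the original data.
substring_hit = 'notes.zip.txt'.include?('.zip')             # substring check
extname_hit   = exts.include?(File.extname('notes.zip.txt')) # extension check

p matched
p substring_hit
p extname_hit
```

With such a short list of extensions the speed difference is unlikely to matter in practice; the stricter matching is probably the more noticeable benefit.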

Upvotes: 0

Steve K

Reputation: 387

Thinking about it a bit further, I realized I could make the select better if I changed the logic slightly:

exts = ['zip', 'tgz', 'sql']
files.select { |file| exts.include?(File.extname(file).delete_prefix('.')) }

As far as I can tell, this is as clean as I can get.

Upvotes: 0

spickermann

Reputation: 106882

I would use Enumerable#grep with a regexp like this:

exts = ['.zip', '.tgz', '.sql']
files = ['file1.txt', 'file2.doc', 'file2.tgz', 'file3.sql', 'file6.foo', 'file4.zip']

files.grep(/#{Regexp.union(exts)}$/)
#=> ["file2.tgz", "file3.sql", "file4.zip"]

I use Regexp.union instead of a simple exts.join('|') because the entries in exts contain dots (.), which have a special meaning in regular expressions. Regexp.union escapes those dots automatically.
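To illustrate the difference, here is a small sketch comparing the two patterns. The name 'archive_zip' is a hypothetical example with no real extension, chosen because an unescaped dot matches the underscore.

```ruby
exts = ['.zip', '.tgz', '.sql']

# Built from a plain join: each '.' matches any single character.
unescaped = Regexp.new("(?:#{exts.join('|')})$")

# Built from Regexp.union: each '.' matches only a literal dot.
escaped = /#{Regexp.union(exts)}$/

# 'archive_zip' is a hypothetical name, not part of the original data.
names = ['archive_zip', 'file4.zip']

loose  = names.grep(unescaped)
strict = names.grep(escaped)

p loose
p strict
```

Only the escaped pattern rejects 'archive_zip', which is why I prefer Regexp.union here.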

Upvotes: 2
