James Newton

Reputation: 7104

bash: using find with multiple file types, provided as an array

In a bash function, I want to list all the files in a given folder which correspond to a given set of file types. In pseudo-code, I am imagining something like this:

getMatchingFiles() {
  output=$1
  directory=$2
  shift 2
  _types_=("$@")

  file_array=find $directory -type f where-name-matches-item-in-_types_

  # do other stuff with $file_array, such as trimming file names to
  # just the basename with no extension

  eval $output="${file_array[@]}"
}

dir=/path/to/folder
types=(ogg mp3)
getMatchingFiles result dir types
echo "${result[@]}"

For your amusement, here are the workarounds I am currently using to achieve this, based on my present knowledge of bash. My problem is with the way the function returns the array of files: the final command tries to execute each file rather than set the output parameter.

getMatchingFiles() {
  local _output=$1
  local _dir=$2
  shift 2
  local _type=("$@")
  local _files=($_dir/$_type/*)
  local -i ii=${#_files[@]}
  local -a _filetypes
  local _file _regex

  case $_type in
    audio )
      _filetypes=(ogg mp3)
      ;;
    images )
      _filetypes=(jpg png)
      ;;
  esac

  _regex="^.*\.("
  for _filetype in "${_filetypes[@]}"
  do
     _regex+=$_filetype"|"
  done

  _regex=${_regex:0:-1}
  _regex+=")$"

  for (( ; ii-- ; ))
  do
    _file=${_files[$ii]}
    if ! [[ $_file =~ $_regex ]];then
      unset _files[ii]
    fi
  done

  echo "${_files[@]}"

  # eval $_output="${_files[@]}" # tries to execute the files
}

dir=/path/to/parent
getMatchingFiles result $dir audio
echo "${result[@]}"

Upvotes: 3

Views: 1589

Answers (3)

PesaThe

Reputation: 7509

As a matter of fact, it is possible to use a nameref to reference an array (note that you need bash 4.3 or later). If you want to store the output of find in an array specified by name, you can reference it like this:

#!/usr/bin/env bash

getMatchingFiles() {

   local -n output=$1
   local dir=$2
   shift 2
   local types=("$@")
   local ext file
   local -a find_ext

   [[ ${#types[@]} -eq 0 ]] && return 1

   for ext in "${types[@]}"; do
      find_ext+=(-o -name "*.${ext}")
   done

   unset 'find_ext[0]'
   output=()

   while IFS=  read -r -d $'\0' file; do
      output+=("$file")
   done < <(find "$dir" -type f \( "${find_ext[@]}" \) -print0)
}

dir=/some/path

getMatchingFiles result "$dir" mp3 txt
printf '%s\n' "${result[@]}"

getMatchingFiles other_result /some/other/path txt
printf '%s\n' "${other_result[@]}"

Don't pass your variable $dir by reference; pass it by value instead. That way you can pass a literal path as well.
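The question also mentions trimming the file names down to the basename with no extension. A minimal sketch of that post-processing step, using a nameref and parameter expansion, could look like this. The helper name stripNames, the sN_ prefix, and the sample paths are assumptions made for illustration, not part of the function above:

```shell
#!/usr/bin/env bash

# Hypothetical helper: rewrites each element of the named array in place.
stripNames() {
   local -n sN_names=$1      # nameref to the caller's array (bash 4.3+)
   local i

   for i in "${!sN_names[@]}"; do
      sN_names[i]=${sN_names[i]##*/}   # drop everything up to the last slash
      sN_names[i]=${sN_names[i]%.*}    # drop the last extension
   done
}

# Sample data standing in for the output of getMatchingFiles
result=("/some/path/a song.mp3" "/some/path/sub/notes.txt")
stripNames result
printf '%s\n' "${result[@]}"
```

Because the expansions are pure bash, no external basename call is needed and names containing spaces survive unchanged.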

Upvotes: 2

Charles Duffy

Reputation: 295687

Supporting the original, unmodified calling convention, and correctly handling extensions with whitespace or glob characters:

#!/usr/bin/env bash

getMatchingFiles() {
  declare -g -a "$1=()"
  declare -n gMF_result="$1"  # variables are namespaced to avoid conflicts w/ targets
  declare -n gMF_dir="$2"
  declare -n gMF_types="$3"
  local gMF_args=( -false )   # empty type list not a special case
  local gMF_type gMF_item

  for gMF_type in "${gMF_types[@]}"; do
    gMF_args+=( -o -name "*.$gMF_type" )
  done

  while IFS= read -r -d '' gMF_item; do
    gMF_result+=( "$gMF_item" )
  done < <(find "$gMF_dir" -type f '(' "${gMF_args[@]}" ')' -print0)
}

dir=/path/to/folder
types=(ogg mp3)
getMatchingFiles result dir types
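To illustrate why the nameref variables carry a prefix, here is a small sketch that is independent of find. The function collectUpper and its cU_ prefix are hypothetical names chosen for this example. The caller's variables are deliberately called result and items, names a function might otherwise be tempted to use for its own locals:

```shell
#!/usr/bin/env bash

# Hypothetical function: copies the named input array, upper-cased,
# into the named output array.
collectUpper() {
  declare -g -a "$1=()"        # reset the caller's target array
  declare -n cU_result="$1"    # prefixed, so it cannot shadow the target
  declare -n cU_items="$2"
  local cU_item

  # Had this nameref been declared as "declare -n result=$1", the call below
  # would make it refer to itself; bash reports that as a circular name
  # reference (exact behavior may differ between versions).
  for cU_item in "${cU_items[@]}"; do
    cU_result+=( "${cU_item^^}" )
  done
}

items=(ogg "m p3")
result=(stale)
collectUpper result items
printf '%s\n' "${result[@]}"
```

The stale element is gone after the call because the target is re-declared empty before anything is appended.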

Upvotes: 0

Renaud Pacalet

Reputation: 29290

Update: namerefs can indeed be arrays (see PesaThe's answer)

Without spaces in file and directory names

I first assume you do not have spaces in your file and directory names. See the second part of this answer if you have spaces in your file and directory names.

In order to pass result, dir and types by name to your function, you need to use namerefs (local -n or declare -n, available only in bash 4.3 or later).

Another, lesser difficulty is building the find command from the types you passed; pattern substitution can do this. All in all, something like this should do about what you want:

#!/usr/bin/env bash

getMatchingFiles() {
    local -n output=$1
    local -n directory=$2
    local -n _types_=$3
    local filter

    filter="${_types_[@]/#/ -o -name *.}"
    filter="${filter# -o }"
    output=( $( find "$directory" -type f \( $filter \) ) )

    # do other stuff with $output, such as trimming file names to
    # just the basename with no extension
}

declare dir
declare -a types
declare -a result=()

dir=/path/to/folder
types=(ogg mp3)
getMatchingFiles result dir types
for f in "${result[@]}"; do echo "$f"; done
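One caveat with the string-based filter: because $filter is expanded unquoted, it is subject to pathname expansion as well as word splitting. The following sketch shows the effect in a scratch directory and contrasts it with building the arguments as an array. The names demo_dir, unquoted and safe are assumptions for this example, and nullglob and failglob are assumed to be off (the default):

```shell
#!/usr/bin/env bash

types=(ogg mp3)
filter="${types[@]/#/ -o -name *.}"   # same construction as in the function
filter="${filter# -o }"

demo_dir=$(mktemp -d)                 # hypothetical scratch directory
touch "$demo_dir/a.ogg" "$demo_dir/b.ogg"
cd "$demo_dir" || exit 1

# Same splitting and globbing that \( $filter \) undergoes:
# *.ogg is replaced by the matching files in the current directory.
unquoted=( $filter )
printf '<%s> ' "${unquoted[@]}"; echo

# Array-based alternative: each pattern stays a single, literal word.
safe=()
for t in "${types[@]}"; do
    safe+=( -o -name "*.$t" )
done
safe=( "${safe[@]:1}" )               # drop the leading -o
printf '<%s> ' "${safe[@]}"; echo

cd / && rm -rf "$demo_dir"
```

With the array form, find would be called as find "$directory" -type f \( "${safe[@]}" \), and the patterns reach find untouched regardless of what the current directory contains.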

With spaces in file and directory names (but not in file suffixes)

If you have spaces in your file and directory names, things are a bit more difficult, because you must assign your array in such a way that names are not split into words. One possibility is to use \0 as the file name separator instead of a space, thanks to the -print0 option of find and the -d $'\0' option of read:

#!/usr/bin/env bash

getMatchingFiles() {
    local -n output=$1
    local -n directory=$2
    local -n _types_=$3
    local filter file

    filter="${_types_[@]/#/ -o -name *.}"
    filter="${filter# -o }"
    while IFS= read -r -d $'\0' file; do
        output+=( "$file" )
    done < <( find "$directory" -type f \( $filter \) -print0 )

    # do other stuff with $output, such as trimming file names to
    # just the basename with no extension
}

declare dir
declare -a types
declare -a result=()

dir=/path/to/folder
types=(ogg mp3)
getMatchingFiles result dir types
for f in "${result[@]}"; do echo "$f"; done
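To see the difference between the two ways of filling the array, here is a small self-contained comparison run against a scratch directory. The directory and file names are made up for the example, and the temporary path itself is assumed to contain no whitespace:

```shell
#!/usr/bin/env bash

demo_dir=$(mktemp -d)     # hypothetical scratch directory
touch "$demo_dir/a song.ogg" "$demo_dir/plain.mp3" "$demo_dir/notes.txt"

# Command substitution: the file name with a space is split into two words.
split=( $( find "$demo_dir" -type f \( -name '*.ogg' -o -name '*.mp3' \) ) )

# NUL-delimited read: one array element per file, whatever the name contains.
intact=()
while IFS= read -r -d '' file; do
    intact+=( "$file" )
done < <( find "$demo_dir" -type f \( -name '*.ogg' -o -name '*.mp3' \) -print0 )

echo "word-split: ${#split[@]} entries, NUL-delimited: ${#intact[@]} entries"
rm -rf "$demo_dir"
```

Only two files match, so the NUL-delimited array holds two entries, while the word-split array holds more because "a song.ogg" is broken at the space.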

With spaces in file and directory names, even in file suffixes

Well, you deserve what happens to you... Still possible but left as an exercise.

Upvotes: 0
