Conor

Reputation: 1

Append output of Find command to Variable in Bash Script

I'm trying to capture the output of a find command in a variable in a Bash script.

I can append the output of find to a log file fine, but I can't get it into a variable, i.e.

This works ok:

find $DIR -type d -name "*" >> $DIRS_REMOVED_LOG

But this won't:

FILES_TO_EVAL=find $DIR -type f \( -name '*.sh' -or -name '*.txt' -or -name '*.xml' -or -name '*.log' \)

ENV=`basename $PS_CFG_HOME | tr "[:lower:]" "[:upper:]"`

FILE_TYPES=(*.log *.xml *.txt *.sh)
DIRS_TO_CLEAR="$PS_CFG_HOME/data/files   $PS_CFG_HOME/appserv/prcs/$ENV/files   $PS_CFG_HOME/appserv/prcs/$ENV/files/CQ"

FILES_REMOVED_LOG=$PS_CFG_HOME/files_removed.log
DIRS_REMOVED_LOG=$PS_CFG_HOME/dirs_removed.log

##Cycle through directories
##Below for files_removed_log works ok but can't get the find into a variable.
for DIR in `echo $DIRS_TO_CLEAR`
do
        echo "Searching $DIR for files:"
        FILES_TO_EVAL=find $DIR -type f \( -name '*.sh' -or -name '*.txt' -or -name '*.xml' -or -name '*.log' \)

        find $DIR -type d -name "*" >> $DIRS_REMOVED_LOG
done

Expected FILES_TO_EVAL to be populated with results of find command but it is empty.

Upvotes: 0

Views: 2187

Answers (2)

Gordon Davisson

Reputation: 125788

In addition to the problems that shellcheck.net will point out, there are a number of subtler problems.

For one thing, you're using all-caps variable names. This is dangerous, because there are a large number of all-caps variables that have special meanings to the shell and/or other tools, and if you accidentally use one of those, it can have weird effects. Lower- or mixed-case variables are much safer (except when you specifically want the special meaning).

Also, you should almost always put double-quotes around variable references (e.g. find "$dir" ... instead of find $dir ...). Without them, the variables will be subject to word splitting and wildcard expansion, which can have a variety of unintended consequences. In some cases, you need word splitting and/or wildcard expansion on a variable's value, but usually not quite the way the shell does it; in these cases, you should look for a better way to do the job.

In the line that's failing,

FILES_TO_EVAL=find $DIR -type f \( -name '*.sh' -or -name '*.txt' -or -name '*.xml' -or -name '*.log' \)

the immediate problem is that you need to use $(find ...) to capture the output from the find command. But this is still dangerous, because it's just storing a newline-delimited list of file paths, and the standard way to expand this (just using an unquoted variable reference) has all the problems I mentioned above. In this case, it will lead to trouble if any filenames contain spaces or wildcards (which are perfectly legal in filenames). If you're in a controlled environment where you can guarantee this won't happen, you'll get away with it... but it's really not the best idea.
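As a quick illustration (the temp directory and filename here are made up for the demo): the capture itself works, but unquoted expansion of the result word-splits a name containing a space:

```shell
#!/usr/bin/env bash
dir=$(mktemp -d)
touch "$dir/has space.txt"
files=$(find "$dir" -type f -name '*.txt')  # the capture itself works
set -- $files     # unquoted expansion: word-splits on the space
count=$#          # one file became two "words"
echo "$count"     # 2
rm -rf "$dir"
```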

Correctly handling a list of filepaths from find is a little complicated, but there are a number of ways to do it. There's a lot of good info in BashFAQ #20: "How can I find and safely handle file names containing newlines, spaces or both?" I'll summarize some common options below:

If you don't need to store the list, just run commands on individual files, you can use find -exec:

find "$dir" -type f \( -name '*.sh' -or -name '*.txt' -or -name '*.xml' -or -name '*.log' \) -exec somecommand {} \;
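(A side note, not from the original answer: if the command accepts multiple file arguments, terminating -exec with `{} +` instead of `\;` batches many matches into one invocation, much like xargs, which is usually faster. A minimal sketch with throwaway files:)

```shell
#!/usr/bin/env bash
dir=$(mktemp -d)
touch "$dir/a.sh" "$dir/b.txt"
# {} + passes all matches to a single invocation of the command
found=$(find "$dir" -type f \( -name '*.sh' -or -name '*.txt' \) -exec ls {} +)
echo "$found" | wc -l   # both files listed
rm -rf "$dir"
```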

If you need to run something more complex, you can use find -print0 to output the list in an unambiguous form, and then use read -d '' to read them. There are a bunch of potential pitfalls here, so here's the version I use to avoid all the trouble spots:

while IFS= read -r -d '' filepath <&3; do
    dosomethingwith "$filepath"
done 3< <(find "$dir" -type f \( -name '*.sh' -or -name '*.txt' -or -name '*.xml' -or -name '*.log' \) -print0)

Note that the <(command) syntax (known as process substitution) is a bash-only feature, so use an explicit bash shebang (#!/bin/bash or #!/usr/bin/env bash) on your script, and don't override it by running the script with sh.

If you really do need to store the list of paths for later, store it as an array:

files_to_eval=()
while IFS= read -r -d '' filepath; do
    files_to_eval+=("$filepath")
done < <(find "$dir" -type f \( -name '*.sh' -or -name '*.txt' -or -name '*.xml' -or -name '*.log' \) -print0)

...or, if you have bash v4.4 or later, it's easier to use readarray (aka mapfile):

readarray -td '' files_to_eval < <(find "$dir" -type f \( -name '*.sh' -or -name '*.txt' -or -name '*.xml' -or -name '*.log' \) -print0)

In either case, you should then expand the array with "${files_to_eval[@]}" to get all the elements without subjecting them to word splitting and wildcard expansion.
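For instance (a tiny sketch with hard-coded names; note one element contains a space and one contains a wildcard character):

```shell
#!/usr/bin/env bash
files_to_eval=("file one.txt" "file*.sh")
# Quoted expansion yields exactly one word per element, untouched
# by word splitting or wildcard expansion.
count=0
for f in "${files_to_eval[@]}"; do
    count=$((count + 1))
done
echo "$count"   # 2 elements, space and * preserved
```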

On to some other problems. In this line:

FILE_TYPES=(*.log *.xml *.txt *.sh)

Here, the wildcards will be expanded immediately to a list of whatever matches they find in the current directory. You should quote them to prevent this:

file_types=("*.log" "*.xml" "*.txt" "*.sh")
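One hedged sketch of how such a pattern array could then actually be used: build the find expression programmatically, so the patterns stay quoted until find itself sees them (the find_expr helper array is my own name, not from the question):

```shell
#!/usr/bin/env bash
file_types=("*.log" "*.xml" "*.txt" "*.sh")

# Turn the pattern array into: \( -name '*.log' -o -name '*.xml' ... \)
find_expr=( \( -name "${file_types[0]}" )
for pat in "${file_types[@]:1}"; do
    find_expr+=( -o -name "$pat" )
done
find_expr+=( \) )

dir=$(mktemp -d)
touch "$dir/a.log" "$dir/b.xml" "$dir/c.bin"
matches=$(find "$dir" -type f "${find_expr[@]}")
echo "$matches" | wc -l   # a.log and b.xml match; c.bin doesn't
rm -rf "$dir"
```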

In these lines:

DIRS_TO_CLEAR="$PS_CFG_HOME/data/files   $PS_CFG_HOME/appserv/prcs/$ENV/files   $PS_CFG_HOME/appserv/prcs/$ENV/files/CQ"
...
for DIR in `echo $DIRS_TO_CLEAR`

You're storing a list as a single string with entries separated by spaces, which has all the word-split and wildcard problems I've been harping on. Also, the echo here is a complication that doesn't do anything useful, and actually makes the wildcard problem worse. Use an array, and avoid all the mess:

dirs_to_clear=("$ps_cfg_home/data/files" "$ps_cfg_home/appserv/prcs/$env/files" "$ps_cfg_home/appserv/prcs/$env/files/CQ")
...
for dir in "${dirs_to_clear[@]}"

Upvotes: 1

John Kugelman

Reputation: 361605

Run your scripts through ShellCheck. It finds lots of common mistakes, much like a compiler would.

FILES_TO_EVAL=find $DIR -type f \( -name '*.sh' -or -name '*.txt' -or -name '*.xml' -or -name '*.log' \)

SC2209: Use var=$(command) to assign output (or quote to assign string).
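Applied to the line above, the corrected assignment would look like this (the demo directory is made up; quoting of $DIR and safe handling of the result are covered in the other answer):

```shell
#!/usr/bin/env bash
DIR=$(mktemp -d)
touch "$DIR/a.sh"
# var=$(command) runs the command and captures its stdout
FILES_TO_EVAL=$(find "$DIR" -type f \( -name '*.sh' -or -name '*.txt' -or -name '*.xml' -or -name '*.log' \))
echo "$FILES_TO_EVAL"
rm -rf "$DIR"
```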

Upvotes: 3
