craigmiller160

Reputation: 6273

rm with an array of filenames

I'm working on an advanced delete script. The idea is that the user supplies a grep regex describing what needs to be deleted, and the script runs rm on everything that matches. This eliminates the need to type out the whole pipeline on the command line each time.

Here is my script so far:

#!/bin/bash
# Script to delete files passed to it

if [ $# -ne 1 ]; then
    echo "Error! Script needs to be run with a single argument that is the regex for the files to delete"
    exit 1
fi

IFS=$'\n'

files=$(ls -a | grep $1 | awk '{print "\"" $0 "\"" }')

## TODO ensure directory support

echo "This script will delete the following files:"
for f in $files; do
    echo "  $f"
done

valid=false

while ! $valid ; do
    read -p "Do you want to proceed? (y/n): "
    case $REPLY in
        y)
            valid=true
            echo "Deleting, please wait"
            echo $files
            rm ${files}
        ;;
        n)
            valid=true
        ;;
        *)
            echo "Invalid input, please try again"
        ;;
    esac
done

exit 0

My problem occurs when the script actually runs the "rm" operation. I keep getting errors saying No such file or directory.

This is the directory I'm working with:

drwxr-xr-x   6 user  staff   204 May  9 11:39 .
drwx------+ 51 user  staff  1734 May  9 09:38 ..
-rw-r--r--   1 user  staff    10 May  9 11:39 temp two.txt
-rw-r--r--   1 user  staff     6 May  9 11:38 temp1.txt
-rw-r--r--   1 user  staff     6 May  9 11:38 temp2.txt
-rw-r--r--   1 user  staff    10 May  9 11:38 temp3.txt

I'm calling the script like this:

easydelete.sh '^tem'

Here is the output:

This script will delete the following files:
  "temp two.txt"
  "temp1.txt"
  "temp2.txt"
  "temp3.txt"
Do you want to proceed? (y/n): y
Deleting, please wait
"temp two.txt" "temp1.txt" "temp2.txt" "temp3.txt"
rm: "temp two.txt": No such file or directory
rm: "temp1.txt": No such file or directory
rm: "temp2.txt": No such file or directory
rm: "temp3.txt": No such file or directory

If I try and directly delete one of these files, it works fine. If I even pass that whole string that prints out before I call "rm", it works fine. But when I do it with the array, it fails.

What am I doing wrong?

Upvotes: 0

Views: 2616

Answers (1)

Charles Duffy

Reputation: 295618

Consider instead:

# put all filenames containing $1 as literal text in an array
#files=( *"$1"* )

# ...or, use a grep with GNU extensions to filter contents into an array:
# this passes filenames around with NUL delimiters for safety
#files=( )
#while IFS= read -r -d '' f; do
#  files+=( "$f" )
#done < <(printf '%s\0' * | grep -E --null --null-data -e "$1")

# ...or, evaluate all files against $1, as regex, and add them to the array if they match:
files=( )
for f in *; do
  [[ $f =~ $1 ]] && files+=( "$f" )
done

# check that the first entry in that array actually exists
[[ -e $files || -L $files ]] || {
  echo "No files containing $1 found; exiting" >&2
  exit 1
}

# warn the user
echo "This script will delete the following files:" >&2
printf '  %q\n' "${files[@]}" >&2

# prompt the user
valid=0
while (( ! valid )); do
  read -p "Do you want to proceed? (y/n): "
  case $REPLY in
    y) valid=1; echo "Deleting; please wait" >&2; rm -f -- "${files[@]}" ;;
    n) valid=1 ;;
  esac
done
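
To try the regex-matching loop without touching real data, here is a minimal sketch that recreates the question's directory in a scratch location. The mktemp directory, the extra keep.txt file and the pattern variable (standing in for "$1") are assumptions made for the demonstration, not part of the script above.

```bash
#!/bin/bash
# Sketch: run the regex-matching loop against a copy of the question's directory.
# The scratch directory, keep.txt and the hard-coded pattern are assumptions.

scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1
touch "temp two.txt" temp1.txt temp2.txt temp3.txt keep.txt

pattern='^tem'     # stands in for "$1" in the script above

# Collect every name that matches the regex, one array element per file
files=( )
for f in *; do
  [[ $f =~ $pattern ]] && files+=( "$f" )
done

# Show the matches in shell-quoted form, then the number of matches
printf '  %q\n' "${files[@]}"
echo "matched ${#files[@]} files"

# Clean up the scratch directory
cd / && rm -rf -- "$scratch"
```

The name containing a space ends up as a single array element, so a later rm -f -- "${files[@]}" would receive it as one argument.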

I'll go into the details below:

  • files has to be explicitly created as an array to actually be an array -- otherwise, it's just a string with a bunch of files in it.

    This is an array:

    files=( "first file" "second file" )
    

    This is not an array (and, in fact, could be a single filename):

    files='"first file" "second file"'
    
  • A proper bash array is expanded with "${arrayname[@]}" to get all contents, or "$arrayname" to get only the first entry.

    [[ -e $files || -L $files ]]
    

    ...thus checks the existence (whether as a file or a symlink) of the first entry in the array -- which is sufficient to tell if the glob expression did in fact expand, or if it matched nothing.

  • A boolean is better represented with a numeric value than with a string containing true or false. Running if $valid executes the contents of valid as a command, so it could perform arbitrary activity if valid were ever set to a user-controlled value, whereas if (( valid )) -- which treats any nonzero numeric value as true and zero as false -- has far less room for side effects in the presence of bugs elsewhere.

  • There's no need to loop over array entries to print them in a list: printf "$format_string" "${array[@]}" reuses the format string as many times as needed whenever it is given more arguments (from the array expansion) than the format string consumes. Moreover, using %q in the format string quotes nonprintable characters, whitespace, newlines, etc. in a form that both human readers and the shell can consume. Without it, a file created with touch $'evil\n - hiding' would appear to be two list entries when in fact it is only one.
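
The first point is also why rm failed in the question: the quote characters added by awk are stored as data, so rm looks for files whose names literally begin and end with a double quote. A minimal sketch of the difference, assuming two of the filenames from the question (count_args and split_string are hypothetical helper names used only here):

```bash
#!/bin/bash
# Sketch: quote characters stored inside a variable are data, not shell syntax.

# Hypothetical helper: print how many arguments it received
count_args() { echo "$#"; }

# What the awk step in the question builds: one string, one quoted name per line
as_string=$'"temp two.txt"\n"temp1.txt"'

# A real array: two elements, with no quote characters stored in them
as_array=( "temp two.txt" "temp1.txt" )

# Split the string on newlines only, as the question's IFS=$'\n' does
split_string() {
  local IFS=$'\n'
  set -f                      # no glob expansion while word-splitting
  printf '%s\n' $as_string    # each word still carries its literal quotes
  set +f
}

split_string                     # names printed with the quotes attached
count_args "${as_array[@]}"      # number of array elements
printf '%s\n' "${as_array[@]}"   # names printed without any quotes
```

Each word produced from the string keeps its surrounding quotes, which matches the names shown in the question's error messages, while the array elements are the bare filenames.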

Upvotes: 3
