Prashant Kumar

Reputation: 22509

Count and remove old files using Unix find

I want to delete files in $DIR_TO_CLEAN older than $DAYS_TO_SAVE days. Easy:

find "$DIR_TO_CLEAN" -mtime +$DAYS_TO_SAVE -exec rm {} \;

I suppose we could add a -type f or a -f flag for rm, but I really would like to count the number of files getting deleted.

We could do this naively:

DELETE_COUNT=`find "$DIR_TO_CLEAN" -mtime +$DAYS_TO_SAVE | wc -l`
find "$DIR_TO_CLEAN" -mtime +$DAYS_TO_SAVE -exec rm {} \;

But this leaves a lot to be desired. Besides duplicating the command, it overestimates the count whenever rm fails to delete a file.

I'm decently comfortable with redirection, pipes (including named ones), subshells, xargs, tee, etc, but I am eager to learn new tricks. I would like a solution that works on both bash and ksh.

How would you count the number of files deleted by find?

Upvotes: 2

Views: 6216

Answers (2)

Thor

Reputation: 47089

I would avoid -exec and go for a piped solution:

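find "$DIR_TO_CLEAN" -type f -mtime +$DAYS_TO_SAVE -print0 \
| awk -v RS='\0' -v ORS='\0' '{ print } END { printf "%d\n", NR > "/dev/stderr" }' \
| xargs -0 rm

This uses awk to count the matches and pass them on to rm. The count goes to stderr so that it does not end up in rm's argument list, and RS='\0' assumes an awk that accepts a NUL record separator, such as gawk.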

Update:

kojiro pointed out that the solution above does not count how often rm succeeds or fails. Since awk also has trouble with unusual file names, I think the following bash solution is better:

find "${DIR_TO_CLEAN?}" -type f -mtime +${DAYS_TO_SAVE?} -print0 |
(
  success=0 fail=0
  # IFS= keeps leading/trailing whitespace in names; -d '' splits on NUL
  while IFS= read -r -d '' file; do
    if rm "$file" 2> /dev/null; then
      (( success++ ))
    else
      (( fail++ ))
    fi
  done
  echo "$success $fail"
)
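
The parentheses around the loop matter in bash: each part of a pipeline runs in a subshell, so the counters would be gone by the time echo ran if it sat outside the group. As a minimal sketch of an alternative, assuming bash (or another shell with read -d and process substitution) and a find that supports -print0, the loop can read from a process substitution so the counters stay in the current shell. The function name remove_old_files is hypothetical.

```shell
# Hypothetical wrapper: $1 = directory, $2 = age in days.
# Requires bash-style process substitution and read -d.
remove_old_files() {
    success=0 fail=0
    while IFS= read -r -d '' file; do
        if rm -- "$file" 2>/dev/null; then
            success=$((success + 1))
        else
            fail=$((fail + 1))
        fi
    done < <(find "$1" -type f -mtime "+$2" -print0)
}
```

After calling remove_old_files "$DIR_TO_CLEAN" "$DAYS_TO_SAVE", the variables success and fail hold the two counts.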

Upvotes: 5

kojiro

Reputation: 77059

You could just use bash within find:

find "$DIR_TO_CLEAN" -mtime +$DAYS_TO_SAVE -exec bash -c 'printf "Total: %d\n" $#; rm "$@"' _ {} +

Of course, this can call bash -c … more than once if the file list exceeds the system's argument-length limit (ARG_MAX), printing a separate total for each batch, and it can also overestimate the count if rm fails. But solving those problems gets messy:

find "$DIR_TO_CLEAN" -mtime +$DAYS_TO_SAVE -exec bash -c 'count=0; for f; do rm "$f" && (( count++ )); done; printf "Total: %d\n" $count' _ {} +
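
One way to get a single total even when find splits the files across several invocations is to let each invocation print one line per successful rm and count the lines once at the end of the pipe. A minimal sketch, using plain sh so that it should behave the same under bash and ksh; the function name delete_old_files is made up for illustration:

```shell
# Hypothetical helper: $1 = directory, $2 = age in days.
# Prints the number of files that rm actually removed.
delete_old_files() {
    find "$1" -type f -mtime "+$2" -exec sh -c '
        for f; do
            # One output line per successful rm; failures print nothing.
            rm -- "$f" 2>/dev/null && echo x
        done
    ' _ {} + | wc -l
}
```

Because wc -l sits outside find, it sees the output of every batch. Some wc implementations pad the number with spaces, so strip whitespace before comparing it as a string.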

To sidestep the ARG_MAX limit entirely, this solution avoids find altogether. If you need it to be recursive, you'll have to use recursive globbing, which is only available in newer shells (globstar is a bash 4 feature).

shopt -s globstar
# Assume DAYS_TO_SAVE has been converted to the timestamp format touch -t expects, [[CC]YY]MMDDhhmm[.SS]. (Exercise for the reader.)
touch -m -t "$DAYS_TO_SAVE" referencefile
count=0
for file in "$DIR_TO_CLEAN/"**/*; do
    if [[ referencefile -nt "$file" ]]; then
        rm "$file" && (( count++ ))
    fi
done
printf 'Total: %d\n' "$count"
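
As a hedged sketch of creating the reference file: assuming GNU touch, its -d option accepts a relative date such as "7 days ago", so no manual timestamp formatting is needed. The helper name make_reference is hypothetical, and other touch implementations may reject this date syntax.

```shell
# Hypothetical helper: $1 = path of the reference file, $2 = age in days.
# Assumption: GNU touch, which parses relative dates passed to -d.
make_reference() {
    touch -d "$2 days ago" "$1"
}
```

With make_reference referencefile 7, any file for which referencefile is newer (-nt) was last modified more than seven days ago.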

Here's an approach using find with -printf. (-printf is a GNU extension that a strictly POSIX find doesn't have, but in that case you can run printf as a standalone utility via -exec.)

find "$DIR_TO_CLEAN" -type f -mtime "+$DAYS_TO_SAVE" -exec rm {} \; -printf '.' | wc -c
find "$DIR_TO_CLEAN" -type f -mtime "+$DAYS_TO_SAVE" -exec rm {} \; -exec printf '.' \; | wc -c
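
These count only successful deletions because -exec … \; evaluates as true only when the command exits with status 0, and find joins its expressions with an implicit AND, so the dot is printed only after rm succeeds. A small sketch that shows the short-circuit without deleting anything, using true and false as stand-ins for rm (the variable names are just for illustration):

```shell
# -prune keeps find on the starting point only, so each command runs once.
# false stands in for a failing rm: the second -exec is never reached.
skipped=$(find . -prune -exec false \; -exec printf '.' \; | wc -c)
# true stands in for a successful rm: one dot is printed and counted.
counted=$(find . -prune -exec true \; -exec printf '.' \; | wc -c)
echo "after failing command: $skipped, after succeeding command: $counted"
```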

Upvotes: 1
