Reputation: 1124
I'm looking for a way to find if any directory from the current directory onward has any duplicate directories, recursively.
i.e.
/user/guy/textfile1.txt
/user/guy/textfile2.txt
/user/guy/textfile3.txt
/user/girl/textfile1.txt
/user/girl/textfile2.txt
/user/girl/textfile3.txt
/user/fella/textfile1.txt
/user/fella/textfile2.txt
/user/fella/textfile3.txt
/user/fella/textfile4.txt
/user/rudiger/rudy/textfile1.txt
/user/rudiger/rudy/textfile2.txt
/user/rudiger/rudy/textfile3.txt
/user/julian/rudy/textfile1.txt
/user/julian/rudy/textfile2.txt
/user/julian/rudy/textfile3.txt
Here guy, girl, rudiger/rudy, and julian/rudy would be duplicate directories (they contain the same filenames), and so would rudiger and julian (each contains only rudy). We would also be checking whether any other directory contains the same files/dirs as "user". Since we are running the script from "user" as the current directory, we want to check the current directory as well for any duplicates down the line.
My current code works, but it's not recursive, which is an issue:
for d in */ ; do
    for d2 in */ ; do
        if [ "$d" != "$d2" ] ; then
            string1="$(ls "$d2")"
            string2="$(ls "$d")"
            if [ "$string1" == "$string2" ] ; then
                echo "The directories $d and $d2 are the same"
            fi
        fi
    done
done
Upvotes: 0
Views: 71
Reputation: 295373
#!/usr/bin/env bash
#       ^^^^- must be bash, not /bin/sh, and version 4.0 or newer.

# associative array mapping each hash to the first directory seen with that hash
declare -A hashes=( )

# sha256sum requiring only openssl, for systems without GNU coreutils
sha256sum() { openssl dgst -sha256 -r | sed -e 's@[[:space:]].*@@'; }

while IFS= read -r -d '' dirname; do
    hash=$(cd "$dirname" && printf '%s\0' * | sha256sum)
    if [[ ${hashes[$hash]} ]]; then
        echo "COLLISION: Directory $dirname has same filenames as ${hashes[$hash]}"
    else
        hashes[$hash]=$dirname
    fi
done < <(find . -type d -print0)
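To sanity-check the approach, here is a sketch (my own harness, not part of the answer) that recreates the question's sample tree in a throwaway temp directory and runs the same hashing loop over it, collecting matches into an array instead of echoing them. Since the comparison is by filename sets only, guy, girl, rudiger/rudy, and julian/rudy all hash identically, and rudiger and julian (each containing only rudy) form a second group:

```shell
#!/usr/bin/env bash
# Sketch: exercise the hashing loop against the question's sample tree.
set -e

tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Recreate the example layout from the question.
mkdir -p "$tmp"/user/{guy,girl,fella} "$tmp"/user/{rudiger,julian}/rudy
for d in guy girl fella; do touch "$tmp/user/$d"/textfile{1,2,3}.txt; done
touch "$tmp/user/fella/textfile4.txt"
for d in rudiger julian; do touch "$tmp/user/$d/rudy"/textfile{1,2,3}.txt; done

cd "$tmp/user"

# Same helper as above: sha256 via openssl, keeping only the hash field.
sha256sum() { openssl dgst -sha256 -r | sed -e 's@[[:space:]].*@@'; }

declare -A hashes=( )
collisions=( )
while IFS= read -r -d '' dirname; do
    hash=$(cd "$dirname" && printf '%s\0' * | sha256sum)
    if [[ ${hashes[$hash]} ]]; then
        collisions+=("$dirname matches ${hashes[$hash]}")
    else
        hashes[$hash]=$dirname
    fi
done < <(find . -type d -print0)

printf '%s\n' "${collisions[@]}"
```

Four collision lines come out: three from the {guy, girl, rudiger/rudy, julian/rudy} group (the first directory found becomes the reference) and one from the {rudiger, julian} pair; fella (four files) and the top directory itself hash uniquely.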
Upvotes: 2