Yuri Burkov

Reputation: 141

How to check if two files have the same exact name in Bash

I have a script that checks whether files in two different folders have the same name. If a file in the second directory has no file with the same name in the first directory, the script should remove it from the second directory.

But my code fails to work.

#!/bin/bash

mn_dir=$1
en_dir=$2

for mn_file in $mn_dir/*;
do
    for en_file in $en_dir/*;
    do
        if [[ "$mn_file"=="$en_file" ]]; then
            echo "$mn_file AND $en_file"
        else
            rm $en_file
        fi
    done
done

The output is shown below. It prints mn/1.txt AND en/10.txt, even though 1.txt and 10.txt don't have the same name. How do I check if two files have the same name?

mn/1.txt AND en/1.txt
mn/1.txt AND en/10.txt
mn/1.txt AND en/2.txt
mn/1.txt AND en/3.txt
mn/1.txt AND en/4.txt
mn/1.txt AND en/5.txt
mn/1.txt AND en/6.txt
mn/1.txt AND en/7.txt
mn/1.txt AND en/8.txt
mn/1.txt AND en/9.txt
mn/2.txt AND en/1.txt
mn/2.txt AND en/10.txt
mn/2.txt AND en/2.txt
mn/2.txt AND en/3.txt
mn/2.txt AND en/4.txt
mn/2.txt AND en/5.txt
mn/2.txt AND en/6.txt
mn/2.txt AND en/7.txt
mn/2.txt AND en/8.txt
mn/2.txt AND en/9.txt
mn/3.txt AND en/1.txt
mn/3.txt AND en/10.txt
mn/3.txt AND en/2.txt
mn/3.txt AND en/3.txt
mn/3.txt AND en/4.txt
mn/3.txt AND en/5.txt
mn/3.txt AND en/6.txt
mn/3.txt AND en/7.txt
mn/3.txt AND en/8.txt
mn/3.txt AND en/9.txt
mn/4.txt AND en/1.txt
mn/4.txt AND en/10.txt
mn/4.txt AND en/2.txt
mn/4.txt AND en/3.txt
mn/4.txt AND en/4.txt
mn/4.txt AND en/5.txt
mn/4.txt AND en/6.txt
mn/4.txt AND en/7.txt
mn/4.txt AND en/8.txt
mn/4.txt AND en/9.txt
mn/5.txt AND en/1.txt
mn/5.txt AND en/10.txt
mn/5.txt AND en/2.txt
mn/5.txt AND en/3.txt
mn/5.txt AND en/4.txt
mn/5.txt AND en/5.txt
mn/5.txt AND en/6.txt
mn/5.txt AND en/7.txt
mn/5.txt AND en/8.txt
mn/5.txt AND en/9.txt

Upvotes: 0

Views: 2668

Answers (3)

Léa Gris

Reputation: 19545

A solution that builds a combined list of file names from both directories, sorts it, and keeps only the names that appear once (uniq -u) to determine which files to delete:

#!/usr/bin/env bash

mn_dir="$1"
en_dir="$2"

if ! [ -d "$mn_dir" ] || ! [ -d "$en_dir" ]; then exit 2; fi

# Get basename of files into arrays for each directory
declare -a mn_files=("$mn_dir/"*)
for i in "${!mn_files[@]}"; do mn_files[$i]="${mn_files[$i]##*/}"; done
declare -a en_files=("$en_dir/"*)
for i in "${!en_files[@]}"; do en_files[$i]="${en_files[$i]##*/}"; done

cd "$en_dir" || exit

# Pipe a combined null delimited list of both files from mn and en directories
# into sorting
# into unique
# and make these arguments to delete files from the en directory
printf '%s\0' "${mn_files[@]}" "${en_files[@]}" \
  | sort -z \
  | uniq -zu \
  | xargs -0 rm -f --

cd - >/dev/null || exit
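
To see why the pipeline above works, here is a small sketch with made-up file names (newline-delimited instead of null-delimited, purely for readability). A name present in both lists appears twice after sorting, so uniq -u drops it; what remains are the names present in only one list.

```bash
#!/usr/bin/env bash
# Hypothetical file names, for illustration only.
mn_names=(1.txt 2.txt 3.txt)
en_names=(1.txt 2.txt 3.txt 4.txt 10.txt)

# Names found in both lists occur twice after sorting;
# uniq -u prints only the lines that occur exactly once.
only_once=$(printf '%s\n' "${mn_names[@]}" "${en_names[@]}" | sort | uniq -u)
printf '%s\n' "$only_once"
```

Names that exist only in the mn list would survive uniq -u as well. In the script above that is harmless, because rm -f runs inside the en directory and silently ignores names that do not exist there.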

Upvotes: 1

M. Nejat Aydin

Reputation: 10123

Assuming filenames in both directories don't contain a newline character, a one-liner in bash:

comm -1 -3 <(cd "$mn_dir" && printf "%s\n" *) <(cd "$en_dir" && printf "%s\n" *)

This lists the files that exist in the directory named by en_dir but are missing from the directory named by mn_dir. It only prints the names; it does not delete anything.
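
If the goal is to delete those files rather than list them, the output of comm could be fed into a loop. The following is only a sketch: remove_extra is a hypothetical helper name, and it assumes that no file name contains a newline and that neither directory is empty (otherwise the unexpanded * is passed through literally).

```bash
#!/usr/bin/env bash
# Sketch: remove files from en_dir whose names are absent from mn_dir.
# Assumes newline-free file names and non-empty directories.
# remove_extra is a hypothetical helper name.
remove_extra() {
    local mn_dir=$1 en_dir=$2 name
    comm -1 -3 \
        <(cd "$mn_dir" && printf '%s\n' *) \
        <(cd "$en_dir" && printf '%s\n' *) |
    while IFS= read -r name; do
        # Each name printed by comm exists only in en_dir.
        rm -- "$en_dir/$name"
    done
}
```

It would be called as remove_extra "$mn_dir" "$en_dir". Replacing rm with echo first gives a preview of what would be removed.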

Upvotes: 0

krayon

Reputation: 96

TL;DR

This addresses the question as asked; see DETAILS for another option that may be closer to the actual objective.

This deletes any file in en_dir that has no same-named file in mn_dir. It does not delete files in mn_dir that are missing from en_dir:

#!/bin/bash

if [ $# -ne 2 ]; then
  echo "Usage: $0 <mn_dir> <en_dir>"
  exit 1
fi

mn_dir=$1
en_dir=$2

for en_file in "$en_dir"/*;
do
    mn_file="$mn_dir/${en_file##*/}"
    if [ -e "$mn_file" ]; then
        echo "$mn_file AND $en_file"
    else
        rm "$en_file"
    fi
done

To clean up in both directions, either run the script twice with the arguments swapped:

./script.bash dir1 dir2
./script.bash dir2 dir1

or do something like:

#!/bin/bash

if [ $# -ne 2 ]; then
  echo "Usage: $0 <dir1> <dir2>"
  exit 1
fi

del_dest_extra_files() {
    if [ $# -ne 2 ]; then
        return 1
    fi

    # local keeps these variables from leaking into the calling scope
    local mn_dir=$1
    local en_dir=$2

    for en_file in "$en_dir"/*;
    do
        mn_file="$mn_dir/${en_file##*/}"
        if [ -e "$mn_file" ]; then
            echo "$mn_file AND $en_file"
        else
            rm "$en_file"
        fi
    done
}

del_dest_extra_files "$1" "$2"
del_dest_extra_files "$2" "$1"

NOTE 1: This does not verify that the files are actually the same in content, or even in type (i.e. a directory and a file with the same name will still match).

NOTE 2: I have quoted the variables for filename safety
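
Regarding NOTE 1, if content and type do matter, the -e test could be replaced with a stricter check. The following is a minimal sketch; same_regular_file is a hypothetical helper name, not part of the scripts above.

```bash
#!/bin/bash
# same_regular_file is a hypothetical helper: it succeeds only when both
# paths are regular files (-f) with byte-identical contents (cmp -s).
same_regular_file() {
    [ -f "$1" ] && [ -f "$2" ] && cmp -s "$1" "$2"
}
```

In the loops above it would be used as if same_regular_file "$mn_file" "$en_file"; then ... in place of the [ -e "$mn_file" ] test.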


DETAILS

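As @jeremysprofile suggests, you could use basename(1). Additionally, you need spaces around the ==. Without them, [[ sees "$mn_file"=="$en_file" as a single non-empty string, which always evaluates to true; that is why every pair was reported as a match. For example:

if [[ "$(basename "$mn_file")" == "$(basename "$en_file")" ]]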

However, each basename call spawns a subshell, which is expensive inside a loop. You can avoid that with the shell's "Remove matching prefix pattern" parameter expansion, which bash supports (it is also part of the POSIX shell):

if [[ "${mn_file##*/}" == "${en_file##*/}" ]]
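
A small sketch may make both points concrete. It uses two sample paths taken from the question's output and compares them first without spaces around ==, then with spaces and with the directory prefix stripped.

```bash
#!/bin/bash
mn_file="mn/1.txt"
en_file="en/10.txt"

# No spaces: "$mn_file"=="$en_file" is a single non-empty word, and
# [[ word ]] only tests that the word is non-empty, so this is always true.
if [[ "$mn_file"=="$en_file" ]]; then no_spaces=match; else no_spaces=differ; fi

# Spaces make == an operator; ${var##*/} strips the longest prefix
# ending in "/", leaving only the file name.
if [[ "${mn_file##*/}" == "${en_file##*/}" ]]; then names=match; else names=differ; fi

echo "without spaces: $no_spaces"
echo "${mn_file##*/} vs ${en_file##*/}: $names"
```

The first test reports a match for any pair of paths, while the second reports that 1.txt and 10.txt differ.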

Lastly, the nested loops are not necessary if you use a file-exists test (-e), so you can write:

#!/bin/bash

mn_dir=$1
en_dir=$2

for mn_file in "$mn_dir"/*;
do
    en_file="$en_dir/${mn_file##*/}"
    if [ -e "$en_file" ]; then
        echo "$mn_file AND $en_file"
    else
        rm "$en_file"
    fi
done

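UPDATE: The loop above iterates over mn_dir, so its rm targets a path in en_dir that is already known not to exist. Given that you are trying to delete destination directory files that are not in the source, we need to flip which directory we're looping over: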

#!/bin/bash

mn_dir=$1
en_dir=$2

for en_file in "$en_dir"/*;
do
    mn_file="$mn_dir/${en_file##*/}"
    if [ -e "$mn_file" ]; then
        echo "$mn_file AND $en_file"
    else
        rm "$en_file"
    fi
done

DANGER: Running this script with only one parameter leaves en_dir empty, so "$en_dir"/* expands to /* and the script could delete files in your root directory (if you had any). You should ensure that both parameters are supplied:

#!/bin/bash

if [ $# -ne 2 ]; then
  echo "Usage: $0 <mn_dir> <en_dir>"
  exit 1
fi

mn_dir=$1
en_dir=$2

for en_file in "$en_dir"/*;
do
    mn_file="$mn_dir/${en_file##*/}"
    if [ -e "$mn_file" ]; then
        echo "$mn_file AND $en_file"
    else
        rm "$en_file"
    fi
done

Another way to address the objective that's probably easier, especially if you want to take contents into account, is to utilise diff to do the comparison:

#!/bin/bash

if [ $# -ne 2 ]; then
    echo "Usage: $0 <dir1> <dir2>"
    exit 1
fi

while read -r f; do
    rm "${f}"
done < <(
    diff -qr "${1}" "${2}"\
    |sed -n 's#^Only in \(.*\): \(.*\)$#\1/\2#p'
)

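This deletes every file that exists in only one of the two directories: files in dir1 that are not in dir2, and files in dir2 that are not in dir1. Files that exist in both directories are kept even if their contents differ, because only diff's "Only in" lines are matched by the sed expression.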
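
Because that loop deletes from both directories, it may be worth previewing the list first. The sketch below only prints the paths that the loop would remove; list_unpaired is a hypothetical helper name, and LC_ALL=C is set on the assumption that diff's "Only in" message may otherwise be translated by the locale.

```bash
#!/bin/bash
# list_unpaired is a hypothetical helper: it prints the path of every
# entry that diff reports as existing in only one of the two directories.
# Nothing is deleted.
list_unpaired() {
    # LC_ALL=C keeps diff's messages in English so the sed pattern matches.
    LC_ALL=C diff -qr "$1" "$2" |
        sed -n 's#^Only in \(.*\): \(.*\)$#\1/\2#p'
}
```

Calling list_unpaired dir1 dir2 and checking the output before running the deleting version is a cheap safety step.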

Upvotes: 2
