Transcendental Llama

Reputation: 45

filename comparison with wildcard

I am working on a script that needs to compare one filename with another and look for a specific change: the "(x)" that OS X adds to a filename when a file is added to a directory where that name already exists. Below is an excerpt of the script, modified so it can be tested without the rest of it.

#!/bin/bash

p2_s2="/Path/to file (name)/containing - many.special chars.docx.gdoc"
next_line="/Path/to file (name)/containing - many.special chars.docx (1).gdoc"
file_ext=$(echo "${p2_s2}" | rev | cut -d '.' -f 1 | rev)
file_name=$(basename "${p2_s2}" ".${file_ext}")
file_dir=$(dirname "${p2_s2}")

esc_file_name=$(printf '%q' "${file_name}")
esc_file_dir=$(printf '%q' "${file_dir}")
esc_next_line=$(printf '%q' "${next_line}")

if [[ ${esc_next_line} =~ ${esc_file_dir}/${esc_file_name}\ \(?\).${file_ext} ]]
 then
  echo "It's a duplicate!"
fi

What I'm trying to do here is detect whether the file next_line is a duplicate of p2_s2. Since I expect multiple duplicates, next_line can have (1), or any other number in parentheses, appended to the filename (although I am sure there will be no double digits). Since I can't do a simple string comparison with a wildcard in the middle, I tried using the "=~" operator and escaping all the special characters. Any idea what I'm doing wrong?

Upvotes: 3

Views: 61

Answers (2)

Rany Albeg Wein

Reputation: 3474

You can trim p2_s2's extension, trim next_line's extension together with the number in parentheses, and check whether you get the same file name. If you do, it's a duplicate. To do this, [[ lets us compare a string against a glob.

I used extglob's +( ... ) pattern, so I can use +([0-9]) to match the number inside the parentheses. Note that extglob is enabled with shopt -s extglob.

#!/bin/bash

p2_s2="/Path/to/ps2.docx.gdoc"
next_line="/Path/to/ps2(1).docx.gdoc"

shopt -s extglob
if [[ "${p2_s2%%.*}" = "${next_line%%\(+([0-9])\).*}" ]]; then
    printf '%s is a duplicate of %s\n' "$next_line" "$p2_s2"
fi

EDIT:

I now see that you've edited your question, so in case this solution is not enough, I'm positive that it'll be a good template to work with.
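
As a rough sketch of how that template might be adapted to the paths in the edited question, the comparison can be turned around: build the expected duplicate name from p2_s2 and glob-match next_line against it. This assumes the " (N)" counter sits right before the final extension, as in the question's example, and that the original name contains at least one dot. The helper name is_glob_duplicate is made up for illustration.

```shell
#!/bin/bash
shopt -s extglob   # must be enabled before the function below is parsed

# is_glob_duplicate is a hypothetical helper name, not part of the answer above.
is_glob_duplicate() {
    local original=$1 candidate=$2
    local ext="${original##*.}"    # text after the last dot, e.g. gdoc
    local stem="${original%.*}"    # everything before the last dot
    # Quoted expansions are matched literally, so spaces, dots and
    # parentheses in the path need no escaping; only +([0-9]) is a pattern.
    [[ "$candidate" == "$stem"\ \(+([0-9])\)."$ext" ]]
}

p2_s2="/Path/to file (name)/containing - many.special chars.docx.gdoc"
next_line="/Path/to file (name)/containing - many.special chars.docx (1).gdoc"

if is_glob_duplicate "$p2_s2" "$next_line"; then
    printf '%s is a duplicate of %s\n' "$next_line" "$p2_s2"
fi
```

Because the path is passed through quoted expansions, no printf '%q' step is needed here.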

Upvotes: 1

Etan Reisner

Reputation: 80931

The (1) in next_line doesn't come before the final "." in the original filename; it comes before the second-to-last ".". However, you only strip off a single "." component as the extension.

So when you generate the comparison filename you end up with /Path/to\ file\ \(name\)/containing\ -\ many.special\ chars.docx\ \(?\).gdoc which doesn't match what you expect.
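
For what it's worth, here is a minimal sketch of a =~ test that avoids the printf '%q' step. It assumes the layout from the question's example, with " (N)" placed right before the last extension; if OS X puts the counter elsewhere, the pattern has to be rearranged. Note that in a regex ? is a quantifier rather than a one-character wildcard, so the digit is matched with [0-9]+ instead. The name is_regex_duplicate is hypothetical.

```shell
#!/bin/bash

# is_regex_duplicate is a hypothetical helper name used only for this sketch.
is_regex_duplicate() {
    local original=$1 candidate=$2
    local ext="${original##*.}"     # text after the last dot
    local stem="${original%.*}"     # everything before the last dot
    # Regex for the counter: a space, "(", one or more digits, ")", a dot.
    local counter_re=' \([0-9]+\)\.'
    # Quoted parts of the right-hand side are matched as literal text;
    # the unquoted ${counter_re}, ^ and $ are treated as regex.
    [[ "$candidate" =~ ^"$stem"${counter_re}"$ext"$ ]]
}

p2_s2="/Path/to file (name)/containing - many.special chars.docx.gdoc"
next_line="/Path/to file (name)/containing - many.special chars.docx (1).gdoc"

if is_regex_duplicate "$p2_s2" "$next_line"; then
    echo "It's a duplicate!"
fi
```

Keeping the regex fragment in its own variable sidesteps most of the quoting trouble inside [[ ... ]].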

If you had added set -x to the top of your script you'd have seen what the shell was actually doing and seen this.
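
A small sketch of that kind of tracing, limited to the comparison itself, might look like the following. The function name trace_comparison and the sample path are made up for illustration; the pattern is a cut-down version of the one in the question.

```shell
#!/bin/bash

# trace_comparison is a hypothetical helper: it runs one [[ ... =~ ... ]]
# test with tracing switched on and sends the trace to stdout.
trace_comparison() {
    local candidate=$1 ext=$2
    (
        set -x   # tracing stays confined to this subshell
        [[ $candidate =~ \ \(?\)\.${ext}$ ]]
    ) 2>&1       # set -x writes to stderr by default
    return 0     # succeed whether or not the regex matched
}

trace_comparison "/tmp/report.docx (1).gdoc" "gdoc"
```

The trace line shows both operands after expansion, which makes it easier to see what the pattern really contains.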

What does OS X actually do in this situation? Does it add (#) before .gdoc? Does it add it before .docx? Does it depend on whether OS X knows what the file type is (that is, whether it is a type it can open natively)?

Upvotes: 0
