Reputation: 21
I have a conversion program (run from the command line) that takes a data file, converts it into a new format, and puts the result in a folder with various subfolders. I want to write a script that checks for duplicates before running this conversion program.
So far, I have
#!/bin/bash
for subj in `ls <directory to data files>`
do
subj_name=$subj
subj_path=<directory to data files>/$subj_name #I need this for my program, can ignore
cd <directory with output folders>
if [ -e "$subj" ]; then
echo "This file already exists" #This will restart the loop and move to the next file
else
echo "This folder does not exist"
My_Program #I can handle this one
fi
done
The script works fine when both names have the same format (i.e. .txt and .txt), but it cannot match a folder against a .txt file with the same name. Are there any changes I can make so that it checks for the same name regardless of file format?
Edit: I did a little experimenting. I put a duplicate data file into the directory with the output folders and the script still didn't recognize it, so I think either the cd line or the if line is wrong. Does anyone have any tips on how I could fix this?
Upvotes: 0
Views: 204
Reputation: 814
Use the syntax below to remove ".txt" from the end of the value of $subj; it expands to the resulting string and leaves $subj itself unchanged (for more info, see "Bash String Manipulation"):
${subj%.txt}
Then check the existence of files/directories with or without .txt:
if [ -e "$subj" ] || [ -e "${subj%.txt}" ]; then
....
If you want to remove any suffix (.txt, .tgz, ...) use ${subj%.*}
to delete all characters after (and including) the last '.' Example:
[bash]$ subj=file.txt
[bash]$ echo "${subj%.*}"
file
Or use ${subj%%.*}
to delete all characters after (and including) the first '.':
[bash]$ subj=file.txt.tgz
[bash]$ echo "${subj%%.*}"
file
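Putting it together, here is a rough sketch of how the whole loop could look. The function name already_converted and the data_dir / out_dir variables are made up for illustration, and the default paths are placeholders for your real directories. It assumes an output folder is named after the data file minus its last extension. It also globs the data directory instead of parsing ls and never calls cd, so relative paths stay valid on every iteration:

```shell
#!/bin/bash
# Hypothetical helper: succeeds if out_dir already contains an entry named
# like the data file, either with or without its last extension.
already_converted() {
    local data_file=$1 out_dir=$2
    local name=${data_file##*/}   # strip leading directories
    local base=${name%.*}         # strip the last extension, if any
    [ -e "$out_dir/$name" ] || [ -e "$out_dir/$base" ]
}

# Placeholder paths; replace them with the real directories.
data_dir=${data_dir:-./data}
out_dir=${out_dir:-./output}

for subj_path in "$data_dir"/*; do
    [ -e "$subj_path" ] || continue   # the glob matched nothing
    if already_converted "$subj_path" "$out_dir"; then
        echo "Skipping ${subj_path##*/}: output already exists"
    else
        echo "Converting ${subj_path##*/}"
        # My_Program would be run here
    fi
done
```

With data/subj01.txt and an existing output/subj01 folder, the loop should skip subj01.txt. If your data file names contain more than one dot, decide whether ${name%.*} or ${name%%.*} matches how the output folders are named.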
Upvotes: 3