Reputation: 1117
I have a folder with a couple of files that I need to organize/manipulate depending on if they both exist, or only one of them exists.
In my folder, folder1/checkthese/, the .bam files are:
file1_aln.bam
file1_aln_sorted.bam
I have a script that checks if I have the unsorted file (which is just *_aln.bam) and the sorted file (*_aln_sorted.bam), but I am having trouble getting my script to run correctly depending on whether they both exist or not.
Here is my mini script:
for files in folder1/checkthese/*.bam
do
if [[ ${files} =~ "_aln.bam" ]] && [[ ${files} =~ "_aln_sorted.bam" ]]
then
echo "both files exist, need to delete unsorted file only"
echo "REMOVE $(basename ${files/_aln*}_aln.bam)"
rm -f ${files/_aln*}_aln.bam
elif [[ ${files} =~ "_aln_sorted.bam" ]] && [[ ! ${files} =~ "_aln.bam" ]]
then
echo "Only sorted file exists, all good"
fi
done
But this is the output I get:
Only sorted file exists, all good.
But clearly the unsorted file exists, so for some reason it is skipping the first part of my loop and not removing the _aln.bam file. I am not sure how to change the conditional in my elif statement so that if ONLY the _aln_sorted.bam file exists, then all is good and I don't need to delete anything.
I think I should not be using the && for my elif statement, but I thought the ! essentially is the NOT boolean for this.
Upvotes: 0
Views: 502
Reputation: 2724
I will present a little less conventional solution, stressing two points:
First create some test files
mkdir data
seq 1 5 | xargs -I{} touch 'data/file_{}_aln.bam'
# first three of them have their sorted equivalents
seq 1 3 | xargs -I{} touch 'data/file_{}_aln_sorted.bam'
Next, let's check which files I'd delete (sed runs before sort so that paired names end up adjacent for uniq):
find data -name '*.bam' | sed 's/_sorted//' | sort | uniq -d
The complement is the set of files I still have to sort:
find data -name '*.bam' | sed 's/_sorted//' | sort | uniq -u
After checking, I can do something like this to delete the files:
find data -name '*.bam' | sed 's/_sorted//' | sort | uniq -d | xargs rm
A final check whether all unsorted files are gone is easy:
ls data/*_aln.bam
# or to get some numeric results:
ls data/*_aln.bam | wc -l
Of course the usual caveats apply: use sensible file names, or you have to use find -print0 | xargs -0 and deal with the consequences.
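For file names containing spaces or other awkward characters, the same idea can be kept NUL-delimited from end to end. This is only a sketch: it assumes the GNU versions of find, sed, sort, uniq and xargs (the -z and -0 options are not available everywhere), and the function name list_redundant_unsorted is made up for illustration.

```shell
#!/bin/bash
# Assumption: GNU find/sed/sort/uniq/xargs, which all understand NUL-delimited records.
# Hypothetical helper: print (NUL-separated) every *_aln.bam that also has a *_aln_sorted.bam.
list_redundant_unsorted() {
    find "$1" -name '*.bam' -print0 |
        sed -z 's/_sorted\.bam$/.bam/' |   # map each sorted name onto its unsorted name
        sort -z |                          # bring identical names next to each other
        uniq -z -d                         # keep only names that occurred twice
}

# Demo in a throwaway directory, with a space in one of the names
demo=$(mktemp -d)
touch "$demo/my file_aln.bam" "$demo/my file_aln_sorted.bam" "$demo/other_aln.bam"
list_redundant_unsorted "$demo" | xargs -0 -r rm -f --
ls "$demo"
rm -rf "$demo"
```

Running the function without the xargs part first gives the same kind of dry run as the uniq -d line above.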
Upvotes: 0
Reputation: 151
Dude, your comparison can't do what you want.
Your first comparison checks for a file whose name contains both the _aln.bam and the _aln_sorted.bam string! And the second checks for a file whose name contains _aln_sorted.bam and doesn't contain _aln.bam!
So both comparisons work on the same single file name in every iteration!
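To see this concretely, the two tests from the question can be run against each name on its own. A small sketch using the file names from the question; the helper name classify is made up for illustration, and it needs bash because of [[ ... =~ ... ]].

```shell
#!/bin/bash
# Hypothetical helper: report which branch of the question's if/elif a single name takes.
classify() {
    local name=$1
    if [[ $name =~ "_aln.bam" ]] && [[ $name =~ "_aln_sorted.bam" ]]; then
        echo "both"           # would need one name to contain both strings
    elif [[ $name =~ "_aln_sorted.bam" ]] && [[ ! $name =~ "_aln.bam" ]]; then
        echo "sorted-only"
    else
        echo "neither"
    fi
}

classify file1_aln.bam          # prints: neither
classify file1_aln_sorted.bam   # prints: sorted-only
```

No name ever reaches the first branch, which is why the loop in the question only ever prints the "Only sorted file exists" message.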
You need this:
#!/bin/bash
for file in /full_path/folder1/checkthese/*.bam
do
if [[ ${file} =~ "_aln.bam" ]]
then
echo "Unsorted file was found! It will be removed!"
echo "Removing the file named ${file}"
rm -f -- "${file}"
echo "File removed!"
elif [[ ${file} =~ "_aln_sorted.bam" ]]
then
echo "${file} is a sorted file!"
fi
done
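Note that the loop above removes every unsorted file it finds. If an unsorted file should go only when its sorted counterpart is present, the test has to look at the filesystem rather than at the name currently in the loop. A minimal sketch, assuming the naming scheme from the question; the function name remove_redundant_unsorted is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: delete NAME_aln.bam only when NAME_aln_sorted.bam exists as well.
remove_redundant_unsorted() {
    local dir=$1 sorted unsorted
    for sorted in "$dir"/*_aln_sorted.bam; do
        [ -e "$sorted" ] || continue                 # the glob matched nothing
        unsorted=${sorted%_aln_sorted.bam}_aln.bam   # derive the partner's name
        if [ -f "$unsorted" ]; then
            echo "both files exist, removing $(basename "$unsorted")"
            rm -f -- "$unsorted"
        else
            echo "only $(basename "$sorted") exists, all good"
        fi
    done
}

# Demo in a throwaway directory
demo=$(mktemp -d)
touch "$demo/file1_aln.bam" "$demo/file1_aln_sorted.bam" \
      "$demo/file2_aln_sorted.bam" "$demo/file3_aln.bam"
remove_redundant_unsorted "$demo"
ls "$demo"
rm -rf "$demo"
```

Looping over the sorted files only means each pair is visited once, and an unsorted file without a sorted partner (file3 in the demo) is left alone.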
Upvotes: 1
Reputation: 1117
-----------EDIT--------------------
Okay, I fixed my original script, which did not use booleans to check for strings in the filename but instead checked whether the files exist. I had originally tried this approach as well and ran into similar problems, but this version worked for me:
for files in folder1/checkthese/*.bam
do
# ${files/_aln*} strips everything from "_aln" onward, leaving the common prefix
if [ -f "${files/_aln*}_aln.bam" ] && [ -f "${files/_aln*}_aln_sorted.bam" ]
then
echo "both files exist, need to delete unsorted file only"
echo "REMOVE $(basename "${files/_aln*}_aln.bam")"
rm -f "${files/_aln*}_aln.bam"
# sorted file present AND unsorted file absent
elif [ -f "${files/_aln*}_aln_sorted.bam" ] && [ ! -f "${files/_aln*}_aln.bam" ]
then
echo "Only sorted file exists, all good"
fi
done
Output works now.
Upvotes: 0