Reputation: 73
I want to run a script that checks whether each snp (the list of snps is in the variable $snplist) is present in all GWAS cohorts (each cohort is in a separate file ending with *renamed_snp_search.txt). If a snp is in all the cohorts, it goes into a log file, and I want the loop to terminate after 10 snps are found. I thought that redefining the $total_snp variable toward the end of the while loop would help with this, but when I run it on sample data the loop just keeps going.
touch snp_search.log
total_snp=$(cat snp_search.log | wc -l)
files=(*renamed_snp_search.txt)
count_files=${#files[@]}
while [ "$total_snp" -lt 10 ] ; do
    for snp in $snplist ; do
        count=$(grep -wl "${snp}" *snp_search.txt | wc -l)
        if ((count == count_files)) ; then
            echo "$snp was found in all $count_files files" >> ${date}_snp_search.log
            total_snp=$(cat snp_search.log | wc -l)
        fi
    done
done
Upvotes: 1
Views: 575
Reputation: 125708
You're misunderstanding the logical structure of the two loops you have: the `while [ "$total_snp" -lt 10 ]` loop and the `for snp in $snplist` loop. The condition on the `while` loop is only tested at the start of each pass through that loop, so it will not interrupt the `for` loop if the condition is met partway through that loop.
Essentially, the execution process is like this:
1. Check whether `$total_snp` is less than 10; it is, so run the `while` loop's contents:
2. Run the `for` loop, searching the files for each item in `$snplist`.
3. Check whether `$total_snp` is less than 10; if it is, run the `while` loop's contents again, otherwise exit the loop.

...so if there are 10 or more snps that are found in all files, it won't notice that it's found enough until it's run through the entire snp list.
(On the other hand, suppose there were only 7 snps that were found in all files. In that case, it'd search for all snps, find the 7 matches, check to see whether it'd found 10 yet, and since it hadn't, it'd run the `for` loop again and find and log the same 7 matches again. After that, `$total_snp` would be 14, so it would finally exit the `while` loop.)
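To see that control flow in isolation, here is a toy sketch (not the original script): the item list, the limit of 3, and the variable names `items`, `limit`, `found` and `passes` are all made up for illustration, and every item counts as a match.

```shell
#!/usr/bin/env bash
# Toy stand-in for the nested loops: hypothetical data, limit of 3.
items="a b c d e"
limit=3
found=0
passes=0

while [ "$found" -lt "$limit" ]; do
    passes=$((passes + 1))
    for item in $items; do
        # Nothing inside this loop re-tests the while condition,
        # so it keeps going past the limit.
        found=$((found + 1))
    done
done

echo "passes=$passes found=$found"   # prints: passes=1 found=5
```

The `for` loop runs through all five items even though the limit was reached at the third; the `while` test only happens once the `for` loop has finished.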
What you want to do instead is to break out of the `for` loop if `$total_snp` reaches 10 as that loop runs. So remove the `while` loop, and add a `break` condition inside the `for` loop:
for snp in $snplist ; do
    count=$(grep -wl "${snp}" *snp_search.txt | wc -l)
    if ((count == count_files)) ; then
        echo "$snp was found in all $count_files files" >> "${date}_snp_search.log"
        # Count the lines of the same log file the echo above appends to
        total_snp=$(wc -l < "${date}_snp_search.log")
        if [ "$total_snp" -ge 10 ]; then
            break # Break out of the `for` loop, we found enough
        fi
    fi
done
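One possible variation, as a sketch only: keep the running total in a shell variable instead of re-reading the log each time. That also sidesteps a mismatch in the question's script, which appends to `${date}_snp_search.log` but counts the lines of `snp_search.log`. Everything below that is not in the question is an assumption made for the demo: the three sample cohort files, the SNP IDs, the `limit` of 2 (the question uses 10), the `logfile` and `workdir` variables, and the extra `-F` flag (treat the SNP ID as a literal string).

```shell
#!/usr/bin/env bash
# Sketch only: sample files, SNP IDs and the limit are made up.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Three fake cohort files; rs1, rs2 and rs4 appear in all of them.
printf '%s\n' rs1 rs2 rs3 rs4 > cohortA_renamed_snp_search.txt
printf '%s\n' rs1 rs2 rs4     > cohortB_renamed_snp_search.txt
printf '%s\n' rs1 rs2 rs4 rs5 > cohortC_renamed_snp_search.txt

snplist="rs1 rs2 rs3 rs4 rs5"
limit=2                     # the question uses 10
logfile="snp_search.log"    # a single name, so writer and reader cannot drift apart
: > "$logfile"              # start with an empty log

files=(*renamed_snp_search.txt)
count_files=${#files[@]}
total_snp=0

for snp in $snplist; do
    # -w: whole-word match, -l: list matching file names, -F: literal string
    count=$(grep -wlF -- "$snp" "${files[@]}" | wc -l)
    if (( count == count_files )); then
        echo "$snp was found in all $count_files files" >> "$logfile"
        total_snp=$((total_snp + 1))
        if (( total_snp >= limit )); then
            break   # enough SNPs found; stop searching
        fi
    fi
done

echo "logged $total_snp SNPs"   # prints: logged 2 SNPs
```

Passing `"${files[@]}"` to `grep` searches exactly the files that were counted, so `count` and `count_files` always refer to the same set.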
Upvotes: 1