Reputation:
I am trying to run blastn from bash, searching QuerySequences against a ReferenceGenome. The problem is that I have too many sequences, and to avoid producing a file with too many unnecessary records (memory issues), I want to limit my output to those results with a percent identity of 90% or higher.
I have run the following script without the flag I am going to mention afterwards and it works perfectly.
#-perc_identity 80?????
blastall -p blastn -d ReferenceGenome -i QuerySequences -G 1 -E 2 -W 15 -F "m D" -U -e 1e-20 -m 8 -a 8 -o NAME.blast.out
But when I add the -perc_identity 90 flag, I get the following error:
[blastall] ERROR: Arguments must start with '-' (the offending argument #4 was: '90')
I have tried the flag in several positions (after blastn, after Querygenome, after "m D"), and the only thing that changes is the number after the # in the error.
Does anybody know the probable reason for this error?
Thank you very much for your help.
Upvotes: 0
Views: 543
Reputation: 1027
You're trying to use a BLAST+ option with the legacy program blastall. Try installing BLAST+ and running the blastn program.
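As a rough sketch, the call from the question might look like this with the BLAST+ blastn program. The flag mapping (-d to -db, -i to -query, -e to -evalue, -W to -word_size, -m 8 to -outfmt 6, -a to -num_threads, -o to -out) is my assumption of the closest equivalents, and the function name run_blastn_min_identity is made up for illustration. The gap cost and masking options (-G, -E, -F, -U) are left out here because their BLAST+ counterparts and allowed values differ, so check `blastn -help` on your install. The database may also need rebuilding with makeblastdb.

```shell
# Sketch: BLAST+ counterpart of the legacy blastall call, with an identity cutoff.
# The option mapping is an assumption; verify it against `blastn -help`.
run_blastn_min_identity() {
  # $1 = database, $2 = query FASTA, $3 = minimum percent identity, $4 = output file
  blastn -db "$1" -query "$2" \
    -perc_identity "$3" \
    -evalue 1e-20 -word_size 15 \
    -outfmt 6 -num_threads 8 \
    -out "$4"
}

# Example call, using the names from the question:
# run_blastn_min_identity ReferenceGenome QuerySequences 90 NAME.blast.out
```

With -perc_identity set, hits below the cutoff should never reach the output file, which is what the question is after.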
Upvotes: 1
Reputation: 31648
I don't think blastall has a parameter for minimum percent identity. There are options to limit the output to X hits.
Percent identity isn't even that useful on its own because it doesn't take into account the length of the hit. The e-value takes the identity AND the length into account to give you the number of alignments this good that you'd expect to occur by chance.
I think what people usually do is get the hits back and take the best X hits based on e-value. Remember, the lower the e-value the better.
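If you have to stay with blastall, one way to do this after the fact is to filter the tabular (-m 8) output yourself. In that format, column 3 is the percent identity and column 11 is the e-value. Below is a minimal sketch, assuming tab-separated output and a sort that supports general numeric ordering (-g, as in GNU sort) so that values like 1e-30 compare correctly. The function name filter_hits is just for illustration.

```shell
# Keep hits at or above a minimum percent identity (column 3 of -m 8 output),
# then order them by e-value (column 11), smallest first.
filter_hits() {
  # $1 = tabular BLAST output file, $2 = minimum percent identity
  awk -F '\t' -v min="$2" '$3 + 0 >= min + 0' "$1" |
    sort -t "$(printf '\t')" -k11,11g
}

# Example: filter_hits NAME.blast.out 90 | head -n 10
```

To avoid writing the big unfiltered file at all, you could probably drop the -o option so blastall writes to standard output, and pipe that straight into the same awk filter, for example `blastall ... -m 8 | awk -F '\t' '$3 >= 90' > NAME.blast.out`.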
Upvotes: 0