Reputation: 102
If you multiply an inequality by a negative number you must reverse the direction of the inequality.
For example:
if x = 6, it is consistent with equation (1) and (2).
Is there a way to multiply an inequality statement by an integer in a one-liner for Python to reverse the signs?
From the practical point of view, I am trying to extract DNA/protein sequences from TBLASTN results. There are strands +1 and -1 and the operation after that condition statement is the same.
# one-liner statement I would like to implement
if (start_codon <= coord <= stop_codon)*strand:
    # do operation

# two-liner statement I know would work
if (start_codon <= coord <= stop_codon) and strand==1:
    # do operation
elif (start_codon >= coord >= stop_codon) and strand==-1:
    # do operation
Upvotes: 3
Views: 977
Reputation: 73450
You could select the lower and upper bounds based on the strand value. This assumes that strand is always either 1 or -1 and makes use of bool being an int subclass in Python, so that True and False can be used to index into pairs:
cdns = (start_codon, stop_codon)
# Python type coercion (True -> 1, False -> 0) in contexts requiring integers
if (cdns[strand==-1] <= coord <= cdns[strand==1]):
    # do operation
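The indexing trick can be wrapped in a small function to see which bound ends up on which side. This is only a sketch: the name in_range_by_index and the sample coordinates are made up for illustration, and it still assumes strand is exactly 1 or -1.

```python
def in_range_by_index(start_codon, stop_codon, coord, strand):
    """Hypothetical helper (not from the answer); assumes strand is 1 or -1."""
    cdns = (start_codon, stop_codon)
    # strand == -1 is False (index 0) on the +1 strand and True (index 1) on the -1 strand,
    # so the lower bound is start_codon for +1 and stop_codon for -1.
    lower = cdns[strand == -1]
    upper = cdns[strand == 1]
    return lower <= coord <= upper


print(in_range_by_index(100, 400, 250, 1))    # forward strand: 100 <= 250 <= 400
print(in_range_by_index(400, 100, 250, -1))   # reverse strand: 400 >= 250 >= 100
print(in_range_by_index(100, 400, 250, -1))   # forward-ordered codons on the reverse strand
```

The first two calls print True and the last one prints False, since on the -1 strand the bounds are swapped to 400 <= coord <= 100.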
Upvotes: 5
Reputation: 476574
You write it like:
if (start_codon <= coord <= stop_codon) and strand==1:
    # do operation
elif (start_codon >= coord >= stop_codon) and strand==-1:
    # do operation
But this is equivalent to:
if abs(strand) == 1 and strand * start_codon <= strand * coord <= strand * stop_codon:
    # do operation
    pass
Or in case we can make the assumption that abs(strand) == 1 always holds:
if strand * start_codon <= strand * coord <= strand * stop_codon:
    # do operation
    pass
This works since x >= y is equivalent to -x <= -y. So instead of "reversing" the condition, we multiply both operands by -1, and thus implicitly reverse the condition. Let us take an example: in case strand == -1, we evaluate -start_codon <= -coord <= -stop_codon. This is equivalent to -start_codon <= -coord and -coord <= -stop_codon. We can rewrite the two subexpressions as start_codon >= coord and coord >= stop_codon, which is equivalent to start_codon >= coord >= stop_codon. So -start_codon <= -coord <= -stop_codon is equivalent to start_codon >= coord >= stop_codon.
We make a single assumption: start_codon, coord and stop_codon are numbers (so that we can multiply them).
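To make the reasoning concrete, here is a minimal sketch that wraps the multiplication trick in a function and walks through one case per strand. The function name in_strand_range, the explicit ValueError check and the coordinates are illustrative assumptions, not part of the question.

```python
def in_strand_range(start_codon, coord, stop_codon, strand):
    """Hypothetical wrapper around the multiplication trick; strand must be 1 or -1."""
    if strand not in (1, -1):
        raise ValueError("strand must be 1 or -1")
    return strand * start_codon <= strand * coord <= strand * stop_codon


# Reverse strand: the test becomes -900 <= -500 <= -300,
# which is the same as 900 >= 500 >= 300.
print(in_strand_range(900, 500, 300, -1))
# Forward strand: 300 <= 500 <= 900.
print(in_strand_range(300, 500, 900, 1))
# Forward-ordered coordinates with strand -1: -300 <= -500 already fails.
print(in_strand_range(300, 500, 900, -1))
```

The first two calls print True and the third prints False.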
We can also gather empirical evidence for this relation with the following setup:
import numpy as np

test_size = 1000
a = np.random.randn(test_size, 4)  # generate 1000x4 matrix of random data
a[:,3] = np.sign(a[:,3])  # make the last column -1 and 1
assert not (a[:,3] == 0).any()  # check no zeros in the last column
b = np.zeros(test_size)  # results for our relation
c = np.zeros(test_size)  # results for the question implementation

for i, (start_codon, coord, stop_codon, strand) in enumerate(a):
    b[i] = strand * start_codon <= strand * coord <= strand * stop_codon
    if (start_codon <= coord <= stop_codon) and strand==1:
        c[i] = 1
    elif (start_codon >= coord >= stop_codon) and strand==-1:
        c[i] = 1
    else:
        c[i] = 0

assert (b == c).all()
If we take the above code, we can slightly modify it to check performance as follows:
import timeit
import numpy as np
from operator import ge, le

test_size = 100000
a = np.random.randn(test_size, 4)  # generate 100000x4 matrix of random data
a[:,3] = np.sign(a[:,3])  # make the last column -1 and 1
assert not (a[:,3] == 0).any()  # check no zeros in the last column
b = np.zeros(test_size)  # target array

def code_kevin():
    for i, (start_codon, coord, stop_codon, strand) in enumerate(a):
        if (start_codon <= coord <= stop_codon) and strand==1:
            b[i] = 1
        elif (start_codon >= coord >= stop_codon) and strand==-1:
            b[i] = 1

def code_schwo():
    for i, (start_codon, coord, stop_codon, strand) in enumerate(a):
        cdns = (start_codon, stop_codon)
        if (cdns[strand==-1] <= coord <= cdns[strand==1]):
            b[i] = 1

def code_moinu():
    for i, (start_codon, coord, stop_codon, strand) in enumerate(a):
        # le for strand 1 (start <= coord <= stop), ge for strand -1
        codon_check = (ge, le)[strand==1]
        if codon_check(start_codon, coord) and codon_check(coord, stop_codon):
            b[i] = 1

def code_wille():
    for i, (start_codon, coord, stop_codon, strand) in enumerate(a):
        if strand * start_codon <= strand * coord <= strand * stop_codon:
            b[i] = 1

def code_fabio():
    for i, (start_codon, coord, stop_codon, strand) in enumerate(a):
        if pow(start_codon/coord, strand) <= 1 <= pow(stop_codon/coord, strand):
            b[i] = 1
As the operation, we use b[i] = 1. Furthermore, we always use if statements (rather than assigning a boolean directly) to make the timing comparison fairer.
We can then use timeit.timeit to run every function a number of times and measure how many seconds it takes:
>>> timeit.timeit(code_kevin, number=100)
8.667507636011578
>>> timeit.timeit(code_schwo, number=100)
10.56048975896556
>>> timeit.timeit(code_wille, number=100)
8.908266504004132
>>> timeit.timeit(code_fabio, number=100)
13.454442486981861
>>> timeit.timeit(code_moinu, number=100)
10.350756354047917
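All of the timed functions loop over the rows in Python, which probably dominates the timings above. As a side note, if the coordinates are already in a NumPy array like a, the multiplication trick can be applied to whole columns at once. The following is only a sketch under that assumption; it was not one of the timed functions, and the seeded generator is there just to make it repeatable.

```python
import numpy as np

rng = np.random.default_rng(0)                 # seeded only for repeatability
a = rng.standard_normal((100_000, 4))
a[:, 3] = np.where(a[:, 3] < 0, -1.0, 1.0)     # last column: strand, -1 or 1

# Each row of a.T is one column of a.
start_codon, coord, stop_codon, strand = a.T

# Same multiplication trick, applied column-wise.
# Chained comparisons do not work on arrays, so the two halves are joined with &.
mask = (strand * start_codon <= strand * coord) & (strand * coord <= strand * stop_codon)
b = mask.astype(float)                         # 1.0 where the condition holds, 0.0 elsewhere
```

Here b plays the same role as the target array in the loops above.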
Upvotes: 4
Reputation: 48067
Functional approach using ge and le from the operator module:
from operator import ge, le

# Sets `codon_check` function as:
# - `le` if "strand == 1"  (start_codon <= coord <= stop_codon)
# - `ge` otherwise         (start_codon >= coord >= stop_codon)
codon_check = (ge, le)[strand==1]

if codon_check(start_codon, coord) and codon_check(coord, stop_codon):
    # do operation
Here, ge(a, b) is the same as a >= b, and le(a, b) is the same as a <= b.
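As a rough, self-contained sketch of the operator-based selection: le is chosen on the +1 strand so that the check reads start_codon <= coord <= stop_codon, and ge on the -1 strand. The function name in_range_by_operator and the coordinates are made up for illustration, and strand is assumed to be 1 or -1.

```python
from operator import ge, le


def in_range_by_operator(start_codon, coord, stop_codon, strand):
    """Hypothetical helper; assumes strand is 1 or -1."""
    # le(a, b) is a <= b, ge(a, b) is a >= b.
    codon_check = le if strand == 1 else ge
    return codon_check(start_codon, coord) and codon_check(coord, stop_codon)


print(in_range_by_operator(100, 250, 400, 1))    # 100 <= 250 <= 400
print(in_range_by_operator(400, 250, 100, -1))   # 400 >= 250 >= 100
print(in_range_by_operator(100, 250, 400, -1))   # 100 >= 250 fails
```

The first two calls print True and the third prints False.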
Upvotes: 1
Reputation: 8360
You can simply use math:
# one-liner statement
if pow(start_codon/coord, strand) <= 1 <= pow(stop_codon/coord, strand):
    # do operation

# two-liner statement I know would work
if (start_codon <= coord <= stop_codon) and strand==1:
    # do operation
elif (start_codon >= coord >= stop_codon) and strand==-1:
    # do operation
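This works because raising a ratio to the power -1 inverts it, which in effect reverses the comparison directions. Note that it relies on the coordinates being positive (and coord being nonzero).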
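A quick way to see where the ratio trick holds is to compare it with the explicit two-branch check. The sketch below assumes Python 3 true division and strand equal to 1 or -1; both function names and the sample values are hypothetical.

```python
def in_range_by_pow(start_codon, coord, stop_codon, strand):
    """Ratio-based check; raises ZeroDivisionError if coord is 0."""
    return pow(start_codon / coord, strand) <= 1 <= pow(stop_codon / coord, strand)


def in_range_reference(start_codon, coord, stop_codon, strand):
    """The explicit two-branch check from the question."""
    if strand == 1:
        return start_codon <= coord <= stop_codon
    return start_codon >= coord >= stop_codon


# Positive coordinates: both checks agree.
print(in_range_by_pow(100, 250, 400, 1), in_range_reference(100, 250, 400, 1))
print(in_range_by_pow(400, 250, 100, -1), in_range_reference(400, 250, 100, -1))

# Negative coordinates: dividing by a negative coord flips the inequality,
# so the ratio check disagrees with the reference here.
print(in_range_by_pow(-400, -250, -100, 1), in_range_reference(-400, -250, -100, 1))
```

The first two lines print True True, and the last prints False True. Sequence coordinates are normally positive, so this limitation may not matter in practice.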
Upvotes: 1