truffle

Reputation: 565

String comparison in Python not working

I am a beginner writing a simple Python program that converts an RNA sequence to a protein sequence. I have an RNA string, and I take every 3 letters (a codon, such as UUU) and convert it to the amino acid letter associated with it. So "UUU" converts to "F", "UUA" converts to "L", and so on. Here is my code:

my_rna="UUAUUGUUUUUC"
my_protein=""
for i in xrange(0,len(my_rna),3):
    sub_str = my_rna[i:(i+3)]
    print sub_str
    if sub_str=="UUU" or "UUC":
        my_protein +="F"
    elif sub_str=="UUA" or "UUG":
        my_protein +="L"
    else:
        print "No match"
    print i

print my_protein

this is the output:

UUA
0
UUG
3
UUU
6
UUC
9
FFFF

So the problem seems to be that every 3-letter RNA substring makes the first if condition true, so all of them translate to "F", even though the output should be "LLFF".

Could someone tell me why this happens and how I can fix it? I read that == and is mean different things when comparing strings in Python, but I don't think that is the problem here: when I replaced == with is, I got the same output.

Thanks!

Upvotes: 0

Views: 109

Answers (2)

NPE

Reputation: 500933

The

if sub_str=="UUU" or "UUC":

should be written as

if sub_str=="UUU" or sub_str=="UUC":

or

if sub_str in ("UUU", "UUC"):

In the latter, the parentheses can be replaced with square brackets or curly braces. Since there's no practical difference in your case, I won't go into that further.
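As a small illustration of that remark, here is a sketch showing the same membership test against a tuple, a list, and a set. The variable names are only for illustration, and the set literal assumes Python 2.7 or later.

```python
sub_str = "UUC"

# The same membership test works with a tuple, a list, or a set.
in_tuple = sub_str in ("UUU", "UUC")
in_list = sub_str in ["UUU", "UUC"]
in_set = sub_str in {"UUU", "UUC"}

print(in_tuple)
print(in_list)
print(in_set)
```

With only two candidates to check, the choice between the three is mostly a matter of taste.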

What you have right now is valid syntactically but doesn't do what you expect it to do. It is equivalent to:

if (sub_str=="UUU") or bool("UUC"):

and always evaluates to True (since bool("UUC") is True).

(All of this applies to the elif comparison too.)
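To see this directly, here is a minimal sketch you can paste into an interpreter. It uses "UUA" as the test value and single-argument print calls, so it should behave the same under Python 2 and Python 3; the names buggy and fixed are just for illustration.

```python
sub_str = "UUA"

# `or` returns its first truthy operand. The comparison is False here,
# so the expression yields the non-empty string "UUC", which is truthy.
buggy = (sub_str == "UUU" or "UUC")
print(repr(buggy))

# Comparing against both values explicitly gives the intended result.
fixed = (sub_str == "UUU" or sub_str == "UUC")
print(fixed)
```

Because any non-empty string is truthy, the buggy expression can never be falsy, whatever sub_str contains.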

Upvotes: 4

Michael0x2a

Reputation: 64358

The problem is this:

if sub_str=="UUU" or "UUC":
    ...

A common mistake many beginners make is to assume this means 'proceed if sub_str is equal to EITHER "UUU" OR "UUC"'. In reality, it means 'proceed if sub_str is equal to "UUU" OR if "UUC" evaluates to a truthy value'.

Since "UUC" is a non-empty string, it is always truthy, so the body of your first 'if' statement always executes.

You can fix this by either doing:

if sub_str == "UUU" or sub_str == "UUC":

...or by doing:

if sub_str in ("UUU", "UUC"):

...which is a nice shortcut.
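Putting that together, here is a sketch of the loop from the question with both conditions rewritten using in. The function name translate is made up for this example, it only handles the four codons from the question, and it uses range and single-argument print so that it should run on both Python 2 and Python 3.

```python
def translate(rna):
    """Translate an RNA string three letters at a time (four codons only)."""
    protein = ""
    for i in range(0, len(rna), 3):
        codon = rna[i:i + 3]
        if codon in ("UUU", "UUC"):
            protein += "F"
        elif codon in ("UUA", "UUG"):
            protein += "L"
        else:
            # Unknown triplets are reported and skipped.
            print("No match for " + codon)
    return protein

print(translate("UUAUUGUUUUUC"))  # LLFF
```

For a full codon table, a dictionary mapping each codon to its letter would probably scale better than a long if/elif chain.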

Upvotes: 0
