Robin wong

Reputation: 29

Is there a more efficient way to rewrite a multi-branch if-else statement?

Is there a more efficient and proper way to rewrite the following code? It has a lot of if-else statements.

key_list below is a list of lists, each containing some DNA bases, e.g. key_list = [['-'],['A'],['A','T'],['C','G','T']], where '-' represents a special gap.

My idea is to use a dictionary to map each set of bases to its code, but I'm not sure this is correct.

output = []
for l in key_list:
    if len(l) == 1:
        output.append(l[0])
    elif len(l) == 2:
        if set(l) == set(['A', 'G']):
            output.append('R')
        elif set(l) == set(['C', 'T']):
            output.append('Y')
        elif set(l) == set(['A', 'C']):
            output.append('M')
        elif set(l) == set(['G', 'T']):
            output.append('K')
        elif set(l) == set(['G', 'C']):
            output.append('S')
        elif set(l) == set(['A', 'T']):
            output.append('W')
        else:
            print('Error!')
    elif len(l) == 3:
        if set(l) == set(['A', 'T', 'C']):
            output.append('H')
        elif set(l) == set(['G', 'T', 'C']):
            output.append('B')
        elif set(l) == set(['A', 'G', 'C']):
            output.append('V')
        elif set(l) == set(['A', 'T', 'G']):
            output.append('D')
        else:
            print('Error!')
    elif len(l) == 4:
        output.append('N')
    else:
        output.append('-')  # if there is only '-' in the column, also add it.

Upvotes: 1

Views: 69

Answers (1)

Jean-François Fabre

Reputation: 140276

You could use tuple(sorted(set(l))) to build a key for a dictionary:

elif len(l) == 2 or len(l) == 3:
    key = tuple(sorted(set(l)))
    output.append(lookup_dict[key])

where lookup_dict is something like:

lookup_dict = {
    ('A', 'G'): 'R',
    ('C', 'T'): 'Y',
    ('A', 'C'): 'M',
    ('A', 'C', 'T'): 'H',   # note that it's A,C,T, not A,T,C, sort order!
}

... and so on (merging both cases of length 2 and 3)
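
To make the "and so on" concrete, here is a minimal sketch of the full table, filled in from the if-else chain in the question. The names LOOKUP and encode are made up for this illustration, and the branches for lengths 1, 4 and "anything else" are copied from the question unchanged.

```python
# Ambiguity codes taken from the question's if-else chain.
# Every key is written in sorted order so it matches tuple(sorted(set(l))).
LOOKUP = {
    ('A', 'G'): 'R',
    ('C', 'T'): 'Y',
    ('A', 'C'): 'M',
    ('G', 'T'): 'K',
    ('C', 'G'): 'S',
    ('A', 'T'): 'W',
    ('A', 'C', 'T'): 'H',
    ('C', 'G', 'T'): 'B',
    ('A', 'C', 'G'): 'V',
    ('A', 'G', 'T'): 'D',
}


def encode(key_list):
    """Hypothetical helper: map each list of bases to a single code."""
    output = []
    for l in key_list:
        if len(l) == 1:
            output.append(l[0])
        elif len(l) == 2 or len(l) == 3:
            # Raises KeyError if the combination is not in the table.
            output.append(LOOKUP[tuple(sorted(set(l)))])
        elif len(l) == 4:
            output.append('N')
        else:
            output.append('-')
    return output


print(encode([['-'], ['A'], ['A', 'T'], ['C', 'G', 'T']]))
# ['-', 'A', 'W', 'B']
```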

Notes:

  • The tuple keys in lookup_dict must be written in sorted order, otherwise tuple(sorted(set(l))) won't match them. The tuple conversion is needed so that the keys are hashable (a list won't do).
  • The lookup cost drops from O(n) in the number of branches with your method (plus the repeated, needless set creation) to O(1) on average thanks to the dictionary.
  • The code does not handle the "error" case. If there is no match you get a KeyError, which is probably better than print('Error!'). If you want to test first, use a key in lookup_dict condition.
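
If you would rather test first than let the KeyError propagate, the check could look roughly like this. It is only a sketch: code_for is a hypothetical helper, the table is deliberately truncated, and raising ValueError is one possible choice rather than the only sensible one.

```python
# Truncated table, for illustration only.
lookup_dict = {('A', 'G'): 'R', ('C', 'T'): 'Y'}


def code_for(l):
    """Hypothetical helper: return the code for l, or raise a descriptive error."""
    key = tuple(sorted(set(l)))
    if key in lookup_dict:
        return lookup_dict[key]
    # Unlike print('Error!'), this stops processing and says which input failed.
    raise ValueError(f"no ambiguity code for {l!r}")


print(code_for(['G', 'A']))
# R
```

lookup_dict.get(key) is another option if you prefer to receive None (or a default of your choice) instead of an exception.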

As suggested in comments, frozenset can also be used as a dictionary key. In that case, the code is simpler:

elif len(l) == 2 or len(l) == 3:
    key = frozenset(l)
    output.append(lookup_dict[key])

and lookup_dict needs one extra pre-processing step to convert its keys to frozenset (the key elements then no longer need to be sorted, which is less error-prone):

lookup_dict = {frozenset(k):v for k,v in lookup_dict.items()}

After this one-time conversion, the lookup is probably slightly faster, since no sorting is needed for each item.
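
Put together, the frozenset variant might look like the sketch below. The names codes and ambiguity_code are made up for this example, and writing the table with string keys is just a compact way to type it (iterating over a string yields its characters, so frozenset('AG') is the set of 'A' and 'G').

```python
# Table written with plain strings; the order of letters inside a key does not matter.
codes = {
    'AG': 'R', 'CT': 'Y', 'AC': 'M', 'GT': 'K', 'GC': 'S', 'AT': 'W',
    'ATC': 'H', 'GTC': 'B', 'AGC': 'V', 'ATG': 'D',
}

# One-time conversion so the keys are hashable, order-independent sets.
lookup_dict = {frozenset(k): v for k, v in codes.items()}


def ambiguity_code(l):
    """Hypothetical helper: look up the code for a list of 2 or 3 bases."""
    return lookup_dict[frozenset(l)]


print(ambiguity_code(['T', 'A']))
# W
print(ambiguity_code(['C', 'T', 'G']))
# B
```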

Upvotes: 2
