Reputation: 783
Hello, I have a collection of (start, end) tuples such as:
indexes_to_delete=((6,9),(20,22),(2,4))
and a sequence that I can open using Biopython:
Sequence1 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
Using indexes_to_delete, I would like to remove the parts from 6 to 9, 20 to 22, and 2 to 4 (positions counted from 1, both ends included). If I follow these coordinates, I should get a new_sequence. The original sequence with its positions is:
A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
After removing those coordinates I should get:
A  E  J  K  L  M  N  O  P  Q  R  S  W  X  Y  Z
1  5  10 11 12 13 14 15 16 17 18 19 23 24 25 26
Upvotes: 1
Views: 161
Reputation: 6613
Here is another approach using several modules.
from string import ascii_uppercase
from intspan import intspan
from operator import itemgetter
indexes_to_delete=((6,9),(20,22),(2,4))
# add dummy 'a' so count begins with 1 for uppercase letters
array = ['a'] + list(ascii_uppercase)
indexes_to_keep = intspan.from_ranges(indexes_to_delete).complement(low = 1, high=26)
slice_of = itemgetter(*indexes_to_keep)
print(' '.join(slice_of(array)))
print(' '.join(map(str,indexes_to_keep)))
Prints:
A E J K L M N O P Q R S W X Y Z
1 5 10 11 12 13 14 15 16 17 18 19 23 24 25 26
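If installing intspan is not an option, the same "complement" idea can be sketched with the standard library only. This is just a minimal sketch and not part of the code above: the names deleted and positions_to_keep are made up for illustration, and the ranges are assumed to be 1-based and inclusive, as in the question.

```python
from string import ascii_uppercase

indexes_to_delete = ((6, 9), (20, 22), (2, 4))
sequence = ascii_uppercase

# Expand every (start, end) pair into the set of 1-based positions it covers.
deleted = {pos for start, end in indexes_to_delete for pos in range(start, end + 1)}

# The complement within 1..len(sequence) is the set of positions to keep.
positions_to_keep = [pos for pos in range(1, len(sequence) + 1) if pos not in deleted]

# Positions are 1-based, so subtract 1 when indexing into the string.
print(' '.join(sequence[pos - 1] for pos in positions_to_keep))
print(' '.join(map(str, positions_to_keep)))
```

This should print the same two lines as the intspan version.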
Upvotes: 1
Reputation: 1001
A bit more readable version:
indexes_to_delete=((6,9),(20,22),(2,4))
Sequence1 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
newSequence1 = ""
for idx, char in enumerate(Sequence1):
    for startIndex, endIndex in indexes_to_delete:
        if startIndex <= idx+1 <= endIndex:
            break
    else:
        newSequence1 += char
print(newSequence1)
Prints: AEJKLMNOPQRSWXYZ
Upvotes: 1
Reputation: 177
def delete_indexes(sequence, indexes_to_delete):
    # first convert the sequence to a dictionary keyed by 1-based position
    seq_dict = {i+1: sequence[i] for i in range(len(sequence))}
    # collect all the keys that need to be removed
    keys_to_delete = []
    for index_range in indexes_to_delete:
        start, end = index_range
        keys_to_delete += range(start, end+1)
    if not keys_to_delete:
        return seq_dict
    # remove the keys from the original dictionary
    # (pop with a default so overlapping ranges do not raise KeyError)
    for key in keys_to_delete:
        seq_dict.pop(key, None)
    return seq_dict
You can use this function to get the new sequence.
new_sequence = delete_indexes(Sequence1, indexes_to_delete)
Of course, new_sequence is still a Python dictionary. You can convert it to a list, a str, or whatever you need. For example, to convert it into a str like the original Sequence1:
print(''.join(list(new_sequence.values())))
Prints:
AEJKLMNOPQRSWXYZ
You can get their coordinates using new_sequence.keys().
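To illustrate, the letters and their original coordinates can be pulled out side by side. This is a minimal sketch, assuming a dictionary shaped like the one delete_indexes returns (1-based position mapped to letter). So that the snippet runs on its own, the dictionary is rebuilt inline with a comprehension as a stand-in. It also relies on dictionaries keeping insertion order, which holds for Python 3.7 and later.

```python
Sequence1 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
indexes_to_delete = ((6, 9), (20, 22), (2, 4))

# Stand-in for the result of delete_indexes(): {1-based position: letter}
new_sequence = {
    pos: ch
    for pos, ch in enumerate(Sequence1, 1)
    if not any(start <= pos <= end for start, end in indexes_to_delete)
}

# Values give the remaining letters, keys give their original coordinates.
letters = ''.join(new_sequence.values())
coordinates = list(new_sequence.keys())

print(letters)
print(coordinates)
```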
Upvotes: 1
Reputation: 195543
indexes_to_delete=((6,9),(20,22),(2,4))
Sequence1 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
s = ''.join(ch for i, ch in enumerate(Sequence1, 1) if not any(a <= i <= b for a, b in indexes_to_delete))
print(s)
Prints:
AEJKLMNOPQRSWXYZ
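The one-liner above tests every position against every range. For long sequences with many ranges, a different technique may be worth trying: sort the ranges and join the slices that lie between them. The sketch below is only an illustration of that idea. The function name remove_ranges is made up, the ranges are assumed to be 1-based and inclusive with start >= 1, and the input is assumed to be a plain str (a Biopython Seq could be converted with str(...) first).

```python
def remove_ranges(sequence, ranges):
    """Return `sequence` without the 1-based, inclusive (start, end) ranges."""
    kept = []
    cursor = 0  # 0-based index of the first character not yet handled
    for start, end in sorted(ranges):
        # Keep the stretch between the previous range and this one, if any.
        if start - 1 > cursor:
            kept.append(sequence[cursor:start - 1])
        # max() keeps the cursor from moving backwards on overlapping ranges.
        cursor = max(cursor, end)
    # Keep whatever follows the last range.
    kept.append(sequence[cursor:])
    return ''.join(kept)


indexes_to_delete = ((6, 9), (20, 22), (2, 4))
Sequence1 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
print(remove_ranges(Sequence1, indexes_to_delete))
```

For the example in the question this should give the same string as the one-liner.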
Upvotes: 2