JY078

Reputation: 403

python group strings in lists

I have two lists of the same length.

list1=[0, 1, 1, 1, 0, 0, 0, 1]
list2=['ATT', 'TTC', 'TCC', 'CCC', 'CCA', 'CAA', 'AAA', 'AAG']

Notes:

1). The number of elements in each list varies, but the two lists always have the same length.

2). list1 contains only the integers 0 and 1. It may contain only 0s, only 1s, or both.

3). list2 contains only strings of the same length. In addition, for each index k > 0, list2[k][:-1]==list2[k-1][1:]. For example, if k=2, list2[k]='TCC' and list2[k][:-1]='TC', while list2[k-1]='TTC' and list2[k-1][1:]='TC'. Thus, list2[k][:-1]==list2[k-1][1:].
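
The overlap condition in note 3 can be checked programmatically. This is a minimal sketch; the helper name has_overlap_property is hypothetical and only used here for illustration:

```python
def has_overlap_property(kmers):
    """Return True if each string minus its last character equals
    the previous string minus its first character."""
    # Pair every string with its predecessor and compare the overlaps.
    return all(cur[:-1] == prev[1:] for prev, cur in zip(kmers, kmers[1:]))


list2 = ['ATT', 'TTC', 'TCC', 'CCC', 'CCA', 'CAA', 'AAA', 'AAG']
print(has_overlap_property(list2))  # True
```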

I want to group the strings in list2 based on their corresponding elements in list1 (corresponding: elements in list1 and list2 sharing the same index). If list1[n]==1, then list2[n] is merged into the previous string: list2[n-1]+list2[n][-1]. If list1[n]==0 and list1[n+1]==0, list2[n] is left unchanged.

For example, the string corresponding to the first 1 in list1 is 'TTC', so 'ATT' and 'TTC' become 'ATTC'. The string corresponding to the second 1 is 'TCC', so 'ATTC' (note: already combined) and 'TCC' become 'ATTCC'. Then list1[3]==1, so 'ATTCC' and 'CCC' become 'ATTCCC'. list1[7]==1, so 'AAG' (list2[7]) needs to be grouped with its previous string, 'AAA', and the two become 'AAAG'. The only case where a string is left unchanged is when list1[n]==0 and list1[n+1]==0. That is why 'CCA' stays the same (list1[4]==0 and list1[5]==0) and list2[5]='CAA' remains unchanged (list1[5]==0 and list1[6]==0).

After the transformation, this is the output I want: newlist2=['ATTCCC', 'CCA', 'CAA', 'AAAG']

This is my attempt, which is obviously not correct:

for i in range(1, len(list1)):
    if list1[i]==1:
        newlist2[i-1]+newlist2[i][-1]  
    else:
        do nothing

Upvotes: 0

Views: 2060

Answers (2)

Ajax1234

Reputation: 71451

You can use itertools.groupby with functools.reduce:

import itertools
import functools
list1=[0, 1, 1, 1, 0, 0, 0, 1]
list2=['ATT', 'TTC', 'TCC', 'CCC', 'CCA', 'CAA', 'AAA', 'AAG']
d = [[a, [h for _, h in b]] for a, b in itertools.groupby(zip(list1, list2), key=lambda x:x[0])]
final_results = [i for b in filter(None, [[functools.reduce(lambda x, y:x+y[-1], [d[i-1][-1][-1]]+d[i][-1])] if d[i][0] else d[i][-1][:-1] for i in range(len(d))]) for i in b]

Output:

['ATTCCC', 'CCA', 'CAA', 'AAAG']
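
For readers who find the one-liner hard to follow, the same groupby idea can be unrolled into a loop. This is only a sketch under stated assumptions, not the code above: the function name merge_by_runs is hypothetical, a leading 1 (which has no predecessor) is assumed to start a new string, and a trailing run of 0s is kept whole, whereas the one-liner above drops the final string when list1 ends in 0.

```python
import itertools


def merge_by_runs(flags, kmers):
    """Merge each run of 1-flagged strings into the string just before the run."""
    result = []
    # groupby yields one group per run of consecutive equal flags.
    for flag, pairs in itertools.groupby(zip(flags, kmers), key=lambda p: p[0]):
        run = [kmer for _, kmer in pairs]
        if flag:
            if not result:
                # Assumption: a leading 1 has nothing to merge into,
                # so its string starts the first output entry.
                result.append(run.pop(0))
            # Append the last character of every string in the run.
            result[-1] += ''.join(kmer[-1] for kmer in run)
        else:
            # Strings flagged 0 are emitted as they are.
            result.extend(run)
    return result


list1 = [0, 1, 1, 1, 0, 0, 0, 1]
list2 = ['ATT', 'TTC', 'TCC', 'CCC', 'CCA', 'CAA', 'AAA', 'AAG']
print(merge_by_runs(list1, list2))  # ['ATTCCC', 'CCA', 'CAA', 'AAAG']
```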

Upvotes: 2

user3483203

Reputation: 51165

Is this what you want?

list1=[0, 1, 1, 1, 0, 0, 0, 1]
list2=['ATT', 'TTC', 'TCC', 'CCC', 'CCA', 'CAA', 'AAA', 'AAG']

indices = []
for i in range(1, len(list1)):
    if list1[i]==1:
        list2[i] = list2[i-1]+list2[i][-1]
        indices.append(i-1)


list2 = [list2[i] for i in range(len(list2)) if i not in indices]
print(list2)

Output:

['ATTCCC', 'CCA', 'CAA', 'AAAG']

I loop through your list, apply your method, and keep track of the indices that need to be removed. At the end, I create a new list without those indices.
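
The same result can also be built in a single pass, without modifying list2 in place or tracking indices to remove (the `i not in indices` test scans a list on every iteration). This is a sketch of an alternative, not the code above; the function name merge_flagged is hypothetical, and a 1 at index 0 is assumed to start a new string because it has no predecessor.

```python
def merge_flagged(flags, kmers):
    """Build the merged list in one pass over both inputs."""
    merged = []
    for flag, kmer in zip(flags, kmers):
        if flag == 1 and merged:
            # Extend the most recent output string by one character.
            merged[-1] += kmer[-1]
        else:
            # A 0 flag (or a leading 1) starts a new output string.
            merged.append(kmer)
    return merged


list1 = [0, 1, 1, 1, 0, 0, 0, 1]
list2 = ['ATT', 'TTC', 'TCC', 'CCC', 'CCA', 'CAA', 'AAA', 'AAG']
print(merge_flagged(list1, list2))  # ['ATTCCC', 'CCA', 'CAA', 'AAAG']
```

Because only the last output string is ever touched, the inputs are left unchanged and no second filtering pass is needed.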

Upvotes: 2
