J Cena

Reputation: 1063

How to remove list elements within a loop effectively in Python

I have the following code.

for item in my_list:
        print(item[0])
        temp = []
        current_index = my_list.index(item)
        garbage_list = creategarbageterms(item[0])

        for ele in my_list:
            if my_list.index(ele) != current_index:
                for garbage_word in garbage_list:
                    if garbage_word in ele:
                        print("concepts: ", item, ele)
                        temp.append(ele)
        print(temp)

Now, I want to remove ele from my_list when it gets appended to temp (so that it won't get processed in the main loop, as it is a garbage word).

I know it is bad to remove elements directly from a list while looping over it. Is there an efficient way of doing this?

For example, if my_list is as follows:

    my_list = [["tim_tam", 879.3000000000001], ["yummy_tim_tam", 315.0], ["pudding", 298.2],
               ["chocolate_pudding", 218.4], ["biscuits", 178.20000000000002], ["berry_tim_tam", 171.9],
               ["tiramusu", 158.4], ["ice_cream", 141.6], ["vanilla_ice_cream", 122.39999999999999]]

1st iteration

For the first element, tim_tam, I get garbage words such as yummy_tim_tam and berry_tim_tam, so they get added to my temp list.

Now I want to remove yummy_tim_tam and berry_tim_tam from the list (because they have already been added to temp), so that they are not processed again in the main loop.

2nd iteration

Now, since yummy_tim_tam is no longer in the list, this will process pudding. For pudding I get a different set of garbage words such as chocolate_pudding, biscuits and tiramusu, so they get added to temp and removed.

3rd iteration

ice_cream will be selected, and the process goes on.

My final objective is to get three separate lists as follows.

["tim_tam", 879.3000000000001], ["yummy_tim_tam", 315.0], ["berry_tim_tam", 171.9]

["pudding", 298.2], ["chocolate_pudding", 218.4], ["biscuits", 178.20000000000002], ["tiramusu", 158.4]

["ice_cream", 141.6], ["vanilla_ice_cream", 122.39999999999999]

Upvotes: 4

Views: 153

Answers (3)

joaquin

Reputation: 85683

This code produces what you want:

my_list = [['tim_tam', 879.3], ['yummy_tim_tam', 315.0], ['pudding', 298.2],
           ['chocolate_pudding', 218.4], ['biscuits', 178.2], ['berry_tim_tam', 171.9],
           ['tiramusu', 158.4], ['ice_cream', 141.6], ['vanilla_ice_cream', 122.39]
           ]

creategarbageterms = {'tim_tam' : ['tim_tam','yummy_tim_tam', 'berry_tim_tam'],
                      'pudding': ['pudding', 'chocolate_pudding', 'biscuits', 'tiramusu'],
                      'ice_cream': ['ice_cream', 'vanilla_ice_cream']}

all_data = {}
temp = []
for idx1, item in enumerate(my_list):
    if item[0] in temp: continue
    all_data[idx1] = [item]

    garbage_list = creategarbageterms[item[0]]

    for idx2, ele in enumerate(my_list):
        if idx1 != idx2:
            for garbage_word in garbage_list:
                if garbage_word in ele:
                    temp.append(ele[0])
                    all_data[idx1].append(ele)

for item in all_data.values():
    print('-', item)  

This produces:

- [['tim_tam', 879.3], ['yummy_tim_tam', 315.0], ['berry_tim_tam', 171.9]]
- [['pudding', 298.2], ['chocolate_pudding', 218.4], ['biscuits', 178.2], ['tiramusu', 158.4]]
- [['ice_cream', 141.6], ['vanilla_ice_cream', 122.39]]  

Note that, for the purpose of the example, I created a mock creategarbageterms (as a dictionary rather than a function) that produces the term lists as you defined them in your post. Note also the use of a dictionary (all_data) to collect the groups, which allows any number of iterations, that is, any number of final lists.
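The same idea can be wrapped in a function so that the lookup of garbage terms is pluggable. This is only a sketch: group_by_garbage_terms and mock_terms are names made up for illustration, and the mock table stands in for the real creategarbageterms, whose behavior is not shown in the question.

```python
def group_by_garbage_terms(items, get_terms):
    """Group [name, score] items; get_terms(name) returns that name's garbage terms.

    Hypothetical helper, not part of the original post.
    """
    seen = set()      # names already assigned to some group
    groups = []
    for idx, item in enumerate(items):
        name = item[0]
        if name in seen:
            continue  # already absorbed by an earlier group
        terms = set(get_terms(name))
        group = [item]
        for other_idx, other in enumerate(items):
            if other_idx != idx and other[0] in terms and other[0] not in seen:
                seen.add(other[0])
                group.append(other)
        seen.add(name)
        groups.append(group)
    return groups


# Mock term table standing in for creategarbageterms (assumed data).
mock_terms = {
    "tim_tam": ["yummy_tim_tam", "berry_tim_tam"],
    "pudding": ["chocolate_pudding", "biscuits", "tiramusu"],
    "ice_cream": ["vanilla_ice_cream"],
}

my_list = [["tim_tam", 879.3], ["yummy_tim_tam", 315.0], ["pudding", 298.2],
           ["chocolate_pudding", 218.4], ["biscuits", 178.2], ["berry_tim_tam", 171.9],
           ["tiramusu", 158.4], ["ice_cream", 141.6], ["vanilla_ice_cream", 122.39]]

groups = group_by_garbage_terms(my_list, lambda name: mock_terms.get(name, []))
for group in groups:
    print("-", group)
```

Using a set for the already-grouped names keeps each membership check cheap, and the original list is never modified while it is being looped over.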

Upvotes: 3

Patrick Artner

Reputation: 51683

I would propose to do it like this:

mylist = [["tim_tam", 879.3000000000001],   
          ["yummy_tim_tam", 315.0],
          ["pudding", 298.2], 
          ["chocolate_pudding", 218.4], 
          ["biscuits", 178.20000000000002],
          ["berry_tim_tam", 171.9], 
          ["tiramusu", 158.4], 
          ["ice_cream", 141.6], 
          ["vanilla_ice_cream", 122.39999999999999]]

d = set()   # remembers unique keys, first one in wins

for i in mylist:
    shouldAdd = True
    for key in d:
        if i[0].find(key) != -1:    # a key already in the set is part of this name
            shouldAdd = False       # do not add it

    if not d or shouldAdd:          # empty set or unique: add to set
        d.add(i[0]) 

myCleanList = [x for x in mylist if x[0] in d]    # clean list to use only keys in set

print(myCleanList)

Output:

[['tim_tam', 879.3000000000001], 
 ['pudding', 298.2], 
 ['biscuits', 178.20000000000002], 
 ['tiramusu', 158.4], 
 ['ice_cream', 141.6]]

If the order of things in the list is not important, you could use a dictionary directly - and create a list from the dict.
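A minimal sketch of the dictionary variant, assuming Python 3.7+ where a dict keeps insertion order (so in practice the first-seen order survives as well). The names groups, clean_list and sub_lists are made up for illustration. Matching is by substring, as in the code above, so biscuits and tiramusu end up as keys of their own, and each item is placed in the first group whose key it contains.

```python
mylist = [["tim_tam", 879.3000000000001],
          ["yummy_tim_tam", 315.0],
          ["pudding", 298.2],
          ["chocolate_pudding", 218.4],
          ["biscuits", 178.20000000000002],
          ["berry_tim_tam", 171.9],
          ["tiramusu", 158.4],
          ["ice_cream", 141.6],
          ["vanilla_ice_cream", 122.39999999999999]]

groups = {}   # first-seen key -> list of items whose name contains that key

for item in mylist:
    name = item[0]
    # first existing key that is a substring of this name, or None
    match = next((key for key in groups if key in name), None)
    if match is None:
        groups[name] = [item]        # new unique key, starts its own group
    else:
        groups[match].append(item)   # belongs to an existing group

clean_list = [group[0] for group in groups.values()]   # one item per key
sub_lists = list(groups.values())                      # all groups

print(clean_list)
print(sub_lists)
```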

If you need sublists, create them:

similarThings = [ [x for x in mylist if x[0].find(y) != -1] for y in d]

print(similarThings)

Output:

[
    [['tim_tam', 879.3000000000001], ['yummy_tim_tam', 315.0], ['berry_tim_tam', 171.9]], 
    [['tiramusu', 158.4]], 
    [['ice_cream', 141.6], ['vanilla_ice_cream', 122.39999999999999]], 
    [['pudding', 298.2], ['chocolate_pudding', 218.4]], 
    [['biscuits', 178.20000000000002]]
]

As @joaquin pointed out in the comments, I am missing the creategarbageterms() function that groups tiramusu and biscuits with pudding, so this does not fit the question 100%. My answer is advocating "do not modify lists during iteration; use an appropriate set or dict to filter them into groups". Unique keys here are names that do not contain any earlier key as a substring.

Upvotes: 2

Dennis Soemers

Reputation: 8518

You want to have an outer loop that's looping through a list, and an inner loop that can modify that same list.
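To see why this needs care, here is a minimal illustration with a made-up list of numbers (not from the question): removing inside a plain for loop silently skips the element that slides into the freed slot.

```python
numbers = [1, 2, 2, 3]

# Removing inside a for loop: the iterator's position keeps advancing
# while the remaining elements shift left, so the second 2 is never visited.
for n in numbers:
    if n == 2:
        numbers.remove(n)

print(numbers)  # [1, 2, 3], one of the 2s survived
```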

I saw you got suggestions in the comments to simply not remove entries during the inner loop at all, but instead check if terms already are in temp. This is possible, and may be easier to read, but is not necessarily the best solution with respect to processing time.

I also see you received an answer from Patrick using dictionaries. This is probably the cleanest solution for your specific use-case, but does not address the more general question in your title which is specifically about removing items in a list while looping through it. If for whatever reason this is really necessary, I would propose the following:

idx = 0
while idx < len(my_list):
    item = my_list[idx]
    print(item[0])
    temp = []
    garbage_list = creategarbageterms(item[0])

    ele_idx = 0
    while ele_idx < len(my_list):
        ele = my_list[ele_idx]
        # never remove the current item itself
        if ele_idx != idx and any(garbage_word in ele for garbage_word in garbage_list):
            print("concepts: ", item, ele)
            temp.append(ele)
            del my_list[ele_idx]
            if ele_idx < idx:
                idx -= 1      # the current item shifted one slot to the left
        else:
            ele_idx += 1      # advance only when nothing was deleted
    print(temp)
    idx += 1

The key insight here is that, by using a while loop instead of a for loop, you can take more detailed, "manual" control of the control flow of the program, and more safely do "unconventional" things in your loop. I'd only recommend doing this if you really have to, though. This solution is closer to the literal question you asked, and closer to your original code, but maybe not the easiest to read or the most Pythonic.
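For the simpler case where the removal condition is known before the loop starts, two other common patterns avoid index bookkeeping altogether. The sketch below uses made-up example data (garbage_words is an assumption, not the output of creategarbageterms) and is not a drop-in replacement for the nested loops above.

```python
my_list = [["tim_tam", 879.3], ["yummy_tim_tam", 315.0], ["pudding", 298.2],
           ["berry_tim_tam", 171.9]]
garbage_words = {"yummy_tim_tam", "berry_tim_tam"}   # assumed example data

# Option 1: walk the indices backwards, so a deletion never shifts
# the elements that are still waiting to be visited.
reversed_copy = list(my_list)
for i in range(len(reversed_copy) - 1, -1, -1):
    if reversed_copy[i][0] in garbage_words:
        del reversed_copy[i]

# Option 2: rebuild the list in place with slice assignment, so every
# other reference to my_list sees the filtered contents too.
my_list[:] = [item for item in my_list if item[0] not in garbage_words]

print(reversed_copy)
print(my_list)
```

Both options leave only tim_tam and pudding in the list; the second is usually the easier one to read.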

Upvotes: 1
