Delete n occurrences of an element. Reorder list to match expected output

Question

I was able to get the list to contain the correct number of occurrences for each element; however, the output I am getting is in the wrong order. So my question is: how would I reorder this to match the expected output? (Or write code that outputs what's expected. I would prefer help with fixing the code I wrote, however.)

My code:

def delete_nth(order,max_e):
    for i in order:
        if i in order:
            if order.count(i)>max_e:
                order.remove(i)
    return order

My Inputs:

order = [1, 2, 3, 1, 1, 2, 1, 2, 3, 3, 2, 4, 5, 3, 1], max_e =  3

My output:

[1, 2, 1, 2, 3, 3, 2, 4, 5, 3, 1]

Should equal:

[1, 2, 3, 1, 1, 2, 2, 3, 3, 4, 5]

The prompt:

Alice and Bob were on a holiday. Both of them took many pictures of the places they've been, and now they want to show Charlie their entire collection. However, Charlie doesn't like these sessions, since the motive usually repeats. He isn't fond of seeing the Eiffel tower 40 times. He tells them that he will only sit during the session if they show the same motive at most N times. Luckily, Alice and Bob are able to encode the motive as a number. Can you help them to remove numbers such that their list contains each number only up to N times, without changing the order?

Task

Given a list lst and a number N, create a new list that contains each number of lst at most N times without reordering. For example if N = 2, and the input is [1, 2, 3, 1, 2, 1, 2, 3], you take [1, 2, 3, 1, 2], drop the next [1, 2] since this would lead to 1 and 2 being in the result 3 times, and then take 3, which leads to [1, 2, 3, 1, 2, 3].

JonSG · Accepted Answer

Let's take a look at your code first:

def delete_nth(order, max_e):
    for i in order:
        if i in order:                  # this is (or should be) redundant
            if order.count(i) > max_e:  # note this scans order many times
                order.remove(i)         # removes the "first" occurrence not the current
    return order

Word of warning, removing items from a list while iterating over it is full of pitfalls. For example after we do a remove() and go to the next iteration, are we sure what the "next" i is going to be?

anyways, the main problem is we are removing the items from the start of the list rather than from the end of it. There is also the issue of scanning items many times but we can look at that later.

One option then might be to work on a reversed version of the list:

NOTE: do not use delete_nth_v2() it is just here to illustrate a point

def delete_nth_v2(order, max_e):
    order_reversed = list(reversed(order))
    for i in order_reversed:
        if order_reversed.count(i) > max_e:
            order_reversed.remove(i)
    return list(reversed(order_reversed))

This looks like it will do the trick a little better, but we actually still have the problem that i is likely NOT what we expect it to be. to see this, add in a print() statement:

def delete_nth_v2(order, max_e):
    order_reversed = list(reversed(order))
    for i in order_reversed:
        print(i, order_reversed)
        if order_reversed.count(i) > max_e:
            order_reversed.remove(i)
    return list(reversed(order_reversed))

delete_nth_v2([1,2,3,1,2,3,1,2,3,4], 2)

you will see that on the 3rd iteration, we skip what we might have hoped was i == 2 Bummer :-(

Perhaps though there is a way we can track indexes more manually ourselves. This might allow us to also avoid reversing the lists

def delete_nth_v3(order, max_e):
    for index in range(len(order) -1, -1, -1):
        if order.count(order[index]) > max_e:
            order.pop(index)
    return order

Now we are getting someplace. This even produces the "correct" result :-) This seems better and is inline with how we started, but there is still the nagging opportunity to not search the entire list for each item in the list.

Why don't we just build a new list while keeping track of how many of each item we have already seen. This trades a little extra space to avoid the repeated searches.

def delete_nth_v4(items, at_most):
    counts = {}
    keepers = []
    for item in items:
        if counts.setdefault(item, 0) + 1 <= at_most:
            counts[item] += 1
            keepers.append(item)
    return keepers

Again we get the correct result and this time it is potentially much faster (at least with larger lists) :-)

Finally, if it was me, I would duck being responsible for the space of the "new" list and I would look at yielding the results back to the caller. I would also probably swap out setdefault() for a collections.defaultdict()

import collections
def delete_nth_v5(items, at_most):
    counts = collections.defaultdict(int)
    for item in items:
        if counts[item] < at_most:
            counts[item] += 1
            yield item

Note we can verify the equivalence via:

import random
motives = [random.randint(0, 100) for _ in range(1_000)]
print(list(delete_nth_v5(motives, 2)) == delete_nth_v4(motives, 2))

Delete n occurrences of an element. Reorder list to match expected output

Answers (2)

Related Questions