Reputation: 103
I am trying to find an efficient way to search three or more consecutive duplicates and replace them for only one in a Python list.
list_before = [1, 1, 1, 2, 3, 4, 5, 5, 5, 6, 6, 7, 7, 7, 8, 8, 8, 8, 8]
# expected
list_after = [1, 2, 3, 4, 5, 6, 6, 7, 8]
def replace(list_to_replace):
for idx, val in enumerate(list_to_replace):
if idx + 3 < len(list_to_replace):
if val == list_to_replace[idx+1] == list_to_replace[idx+2]:
del list_to_replace[idx+1]
del list_to_replace[idx+2]
return list_to_replace
>>> replace(list_before)
[1, 1, 3, 4, 5, 5, 6, 7, 7, 8, 8, 8]
What seems to be the problem here? Is there a more efficient way?
Upvotes: 6
Views: 553
Reputation: 133554
>>> from itertools import groupby
>>> nums = [1, 1, 1, 2, 3, 4, 5, 5, 5, 6, 6, 7, 7, 7, 8, 8, 8, 8, 8]
>>> [k for k, g in groupby(nums) for i in range(1 + (len(list(g)) == 2))]
[1, 2, 3, 4, 5, 6, 6, 7, 8]
Upvotes: 2
Reputation: 11193
Try this way, defining a custom method to slice the array based on condition:
def take_max_three(iterable):
iterable = sorted(iterable) # It requires the iterable to be sorted, remove if already sorted
i, x, size = 0, 0, len(iterable)
while i < size-1:
if iterable[i] < iterable[i+1]:
ready = iterable[x:i+1]
if len(ready) <= 3:
yield ready
else:
yield ready[0:3]
x = i + 1
i += 1
yield iterable[x:x+3]
Then just call the method on the array, this is a slight modified array:
array = [1, 1, 2, 3, 4, 5, 5, 1, 5, 6, 6, 6, 7, 3, 7, 7, 8, 8, 8, 8, 8, 9]
take_max_three(array)
# => [[1, 1, 1], [2], [3, 3], [4], [5, 5, 5], [6, 6, 6], [7, 7, 7], [8, 8, 8], [9]]
You could further customise the method passing the number of elements to take.
Upvotes: 0
Reputation: 14169
As pointed out by Chris in his answer, a one-liner is possible but it's not pretty at all.
In [88]: list(chain.from_iterable([(x,) if len(y) >= 3 else y for x, y in [(k, tuple(g)) for k, g in groupby(list_before)]]))
Out[88]: [1, 2, 3, 4, 5, 6, 6, 7, 8]
I think there should be a better way but chain
is hacky enough to deal with when dealing with non-iterables.
Upvotes: 1
Reputation: 5224
Just to add an object oriented approach, that I used on streams :
class StreamCount:
def __init__(self, input_values):
self.input_values = input_values
self.output = []
self.current_value = next(iter(input_values), None) # first element if there is any
self.current_count = 0
def process_new(self, value):
if value == self.current_value:
self.current_count += 1
else:
self.update_output()
self.current_count = 1
self.current_value = value
def process_all(self):
for v in self.input_values:
self.process_new(v)
# handle last values suite
self.update_output()
return self.output
def update_output(self):
if self.current_count > 2:
self.output.append(self.current_value)
else:
self.output += [self.current_value for _ in range(self.current_count)]
Tests
input_values = [1, 1, 1, 2, 3, 4, 5, 5, 5, 6, 6, 7, 7, 7, 8, 8, 8, 8, 8]
StreamCount(input_values).process_all()
ouput: [1, 2, 3, 4, 5, 6, 6, 7, 8]
input_values = []
ouput: []
input_values = [None]
ouput: [None]
Upvotes: 0
Reputation: 30
This is my solution:
list_before = [1, 5, 7, 8, 6, 1, 4, 5, 6, 7, 1, 8, 8, 5, 2, 3, 7, 8, 8]
list_after = []
for item in list_before:
if not item in list_after:
list_after.append(item)
Upvotes: -2
Reputation: 41168
I good use case for itertools.groupby
:
>>> from itertools import groupby
>>> list_before = [1, 1, 1, 2, 3, 4, 5, 5, 5, 6, 6, 7, 7, 7, 8, 8, 8, 8, 8]
>>> list_after = []
>>> for k, group in groupby(list_before):
... lst = list(group)
... if len(lst) >= 3:
... list_after.append(k)
... else:
... list_after.extend(lst)
>>> list_after
[1, 2, 3, 4, 5, 6, 6, 7, 8]
It would be possible make a one-liner with itertools.chain
but the for
loop is almost certainly more readable and similarly performant.
Upvotes: 10