Reputation: 421
I'm looking for an algorithm to go through a list such as the following one: [1,1,1,1,2,3,4,5,5,5,3,2]
In this example, I want to select the first "1", since it has a duplicate next to it, then keep going through the list until I find the next number that has a duplicate next to it, and select the last occurrence of that number (i.e. "5" in this example).
Finally, I want the difference between these two numbers (i.e. 5-1).
I have this code at the moment:
i=0
for i in range(len(X)):
    if (X[i] == X[i+1]):
        first_number = X[i]
    elif (X[i] != X[i+1]):
        i+=1
I'd like to add a further condition to my question. Suppose you have the following list: lst=[1,1,1,1,2,3,4,5,5,5,3,3,3,3,2,2,2,4,3]. In this case, your code gives me the differences [4,-2,-1] and then stops. However, I'd like "4-2" to be added to the list as well, because the final "4" is followed by a number less than "4": the list was going up from "2" to "4" and then changes direction. I hope this is clear enough. Many thanks.
Upvotes: 0
Views: 109
Reputation: 82889
You can use itertools.groupby to find groups of repeating numbers, then find the difference between the first two of those:
>>> import itertools
>>> lst = [1,1,1,1,2,3,4,5,5,5,3,2]
>>> duplicates = [k for k, g in itertools.groupby(lst) if len(list(g)) > 1]
>>> duplicates[1] - duplicates[0]
4
Or use duplicates[-1] - duplicates[0] if you want the difference between the first and the last repeated number.
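To see what the groupby filter above is working with, here is a small sketch that prints every run of equal adjacent values together with its length. The names runs and n are only illustrative and are not part of the original answer.

```python
import itertools

lst = [1, 1, 1, 1, 2, 3, 4, 5, 5, 5, 3, 2]

# Each group is a run of equal adjacent values; record (value, run length).
runs = [(k, len(list(g))) for k, g in itertools.groupby(lst)]
print(runs)
# [(1, 4), (2, 1), (3, 1), (4, 1), (5, 3), (3, 1), (2, 1)]

# Keeping only runs longer than 1 gives the repeated values.
duplicates = [k for k, n in runs if n > 1]
print(duplicates)  # [1, 5]
```

Note that groupby only groups adjacent items, so the single 3 and 2 near the end form their own runs of length 1 and are filtered out.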
In the more general case, if you want the difference between all pairs of consecutive repeated numbers, you could combine that with zip:
>>> lst = [1,1,1,1,2,3,4,5,5,5,3,3,3,3,2,2,2]
>>> duplicates = [k for k, g in itertools.groupby(lst) if len(list(g)) > 1]
>>> duplicates
[1, 5, 3, 2]
>>> [x - y for x,y in zip(duplicates, duplicates[1:])]
[-4, 2, 1]
I think I now understand what you want: the difference between any consecutive "plateaus" in the list, where a plateau is either a repeated value or a local minimum or maximum. This is a bit more complicated and takes several steps:
>>> lst=[1,1,1,1,2,3,4,5,5,5,3,3,3,3,2,2,2,4,3]
>>> plateaus = [lst[i] for i in range(1, len(lst)-1) if lst[i] == lst[i-1]
... or lst[i-1] <= lst[i] >= lst[i+1]
... or lst[i-1] >= lst[i] <= lst[i+1]]
>>> condensed = [k for k, g in itertools.groupby(plateaus)]
>>> [y-x for x, y in zip(condensed, condensed[1:])]
[4, -2, -1, 2]
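The same steps can be wrapped in a function for reuse. This is only a sketch of the approach above; the name plateau_differences is hypothetical. Like the original comprehension, it never tests the first or last element as a candidate on its own, so a list shorter than three items yields no differences.

```python
import itertools

def plateau_differences(values):
    """Differences between consecutive plateaus (repeats or local extrema)."""
    # Step 1: collect values that repeat the previous one or are a local max/min.
    plateaus = [
        values[i]
        for i in range(1, len(values) - 1)
        if values[i] == values[i - 1]
        or values[i - 1] <= values[i] >= values[i + 1]
        or values[i - 1] >= values[i] <= values[i + 1]
    ]
    # Step 2: collapse each run of equal plateau values to a single entry.
    condensed = [k for k, _ in itertools.groupby(plateaus)]
    # Step 3: subtract each plateau from the one that follows it.
    return [y - x for x, y in zip(condensed, condensed[1:])]

lst = [1, 1, 1, 1, 2, 3, 4, 5, 5, 5, 3, 3, 3, 3, 2, 2, 2, 4, 3]
print(plateau_differences(lst))  # [4, -2, -1, 2]
```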
Upvotes: 0
Reputation: 76
Solution:
def subDupeLimits( aList ):
    dupList = []
    prevX = None
    for x in aList:
        if x == prevX:
            dupList.append(x) # track duplicates
        prevX = x # update previous x
    # return last duplicate minus first
    return dupList[-1] - dupList[0]
# call it
y = subDupeLimits( [1,1,1,1,2,3,4,5,5,5,3,2] )
# y = 4
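One caveat: the function above raises an IndexError when the list contains no adjacent duplicates, because dupList is empty. A possible variant, sketched below, pairs each item with its predecessor using zip and returns None in that case. The name sub_dupe_limits and the choice of None as the fallback are assumptions, not part of the original answer.

```python
def sub_dupe_limits(values):
    """Last repeated value minus first repeated value, or None if nothing repeats."""
    # Pair each item with the one before it and keep the items that match.
    dups = [cur for prev, cur in zip(values, values[1:]) if prev == cur]
    if not dups:
        return None  # assumed fallback when there are no adjacent duplicates
    return dups[-1] - dups[0]

print(sub_dupe_limits([1, 1, 1, 1, 2, 3, 4, 5, 5, 5, 3, 2]))  # 4
print(sub_dupe_limits([1, 2, 3]))  # None
```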
Upvotes: 1
Reputation: 78546
You can use enumerate with a starting index of 1. Because the count starts at 1, i is the index of the following item, so a duplicate is detected when the current value is equal to the value at the next index:
l = [1,1,1,1,2,3,4,5,5,5,3,2]
r = [v for i, v in enumerate(l, 1) if i < len(l) and v == l[i]]
result = r[-1] - r[0]
# 4
The list r holds every value that is immediately followed by an equal value. r[-1] is the last item and r[0] is the first.
More trials:
>>> l= [1,1,5,5,5,2,2]
>>> r = [v for i, v in enumerate(l, 1) if i < len(l) and v == l[i]]
>>> r[-1] - r[0]
1
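To make the indexing explicit, this sketch lists the (current, next) pairs that the comprehension compares for the list above. The names pairs, cur and nxt are only illustrative.

```python
l = [1, 1, 5, 5, 5, 2, 2]

# With enumerate(l, 1), i is one past the position of v, so l[i] is the next item.
pairs = [(v, l[i]) for i, v in enumerate(l, 1) if i < len(l)]
print(pairs)
# [(1, 1), (1, 5), (5, 5), (5, 5), (5, 2), (2, 2)]

# Keep the current value wherever it equals the next one.
r = [cur for cur, nxt in pairs if cur == nxt]
print(r)             # [1, 5, 5, 2]
print(r[-1] - r[0])  # 1
```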
Upvotes: 1