Reman

Reputation: 8109

Keep only the sublists of a list whose 2nd element repeats

Example:

list = [['1', '13/12/2016', [42, 52]], ['2', '12/12/2016', [36, 46]], ['4', '10/12/2016', [13, 23]], ['4', '10/12/2016', [42, 52]], ['5', '08/12/2016', [42, 52]], ['6', '07/12/2016', [32, 42]], ['7', '12/12/2016', [42, 52]], ['8', '06/12/2016', [42, 52]], ['10', '12/12/2016', [45, 55]], ['11', '08/12/2016', [42, 52]]]

I want to check the 2nd element of every sublist to see whether it repeats.
I want to keep the entire sublist if its 2nd element also appears in another sublist.

Expected results:

endlist = [['2', '12/12/2016', [36, 46]], ['4', '10/12/2016', [13, 23]], ['4', '10/12/2016', [42, 52]], ['5', '08/12/2016', [42, 52]], ['7', '12/12/2016', [42, 52]], ['10', '12/12/2016', [45, 55]], ['11', '08/12/2016', [42, 52]]]

(08/12/2016 | 10/12/2016 | 12/12/2016 are the doubles)

I know how to keep doubles in a flat list ([x for x in l if l.count(x) > 1]), but how do I do this with a list of sublists?

Upvotes: 0

Views: 172

Answers (2)

John Coleman

Reputation: 51998

You can gather the counts into a dictionary and then use that. This scales well if the lists are large, since it needs only one pass to count and one pass to filter:

myList = [['1', '13/12/2016', [42, 52]], ['2', '12/12/2016', [36, 46]], ['4', '10/12/2016', [13, 23]], ['4', '10/12/2016', [42, 52]], ['5', '08/12/2016', [42, 52]], ['6', '07/12/2016', [32, 42]], ['7', '12/12/2016', [42, 52]], ['8', '06/12/2016', [42, 52]], ['10', '12/12/2016', [45, 55]], ['11', '08/12/2016', [42, 52]]]

d = dict()
for subList in myList:
    if subList[1] in d:
        d[subList[1]] += 1
    else:
        d[subList[1]] = 1

doubles = [subList for subList in myList if d[subList[1]] >= 2]

You can of course replace >=2 by ==2 if you want doubles to exclude triples, etc.
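The counting step can also be written with collections.Counter from the standard library. The following is a minimal sketch rather than part of the original answer; the function name keep_repeated, its key_index parameter, and the shortened sample data are made up for illustration.

```python
from collections import Counter

def keep_repeated(rows, key_index=1):
    """Return the rows whose value at key_index occurs in more than one row."""
    # One pass to count how often each key value appears.
    counts = Counter(row[key_index] for row in rows)
    # One pass to keep the rows whose key value was seen at least twice.
    return [row for row in rows if counts[row[key_index]] > 1]

# Shortened sample data (hypothetical subset of the list in the question).
rows = [
    ['1', '13/12/2016', [42, 52]],
    ['2', '12/12/2016', [36, 46]],
    ['4', '10/12/2016', [13, 23]],
    ['4', '10/12/2016', [42, 52]],
    ['6', '07/12/2016', [32, 42]],
]
print(keep_repeated(rows))
# prints [['4', '10/12/2016', [13, 23]], ['4', '10/12/2016', [42, 52]]]
```

The original order of the sublists is preserved, and changing the > 1 test to == 2 gives the "exactly two" variant mentioned above.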

On edit: if you want to keep only the first occurrence of each doubled sublist, modify the dictionary so that it stores the indices at which each 2nd element occurs. Something like this:

d = dict()
for i,subList in enumerate(myList):
    if subList[1] in d:
        d[subList[1]].append(i)
    else:
        d[subList[1]] = [i]

firsts = [subList for i,subList in enumerate(myList) if len(d[subList[1]]) >= 2 and i == d[subList[1]][0]]
print(firsts) #prints [['2', '12/12/2016', [36, 46]], ['4', '10/12/2016', [13, 23]], ['5', '08/12/2016', [42, 52]]]
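The same index bookkeeping can be sketched with collections.defaultdict, which removes the need for the if/else branch. This is an illustrative variant, not the answer's own code; the name first_of_repeated and the key_index parameter are assumptions.

```python
from collections import defaultdict

def first_of_repeated(rows, key_index=1):
    """Return the first row for each key value that occurs more than once."""
    # Map each key value to the list of positions where it appears.
    positions = defaultdict(list)
    for i, row in enumerate(rows):
        positions[row[key_index]].append(i)
    # Keep only the first position of keys seen at least twice, in list order.
    first_indices = sorted(idx[0] for idx in positions.values() if len(idx) > 1)
    return [rows[i] for i in first_indices]

rows = [
    ['1', '13/12/2016', [42, 52]],
    ['2', '12/12/2016', [36, 46]],
    ['4', '10/12/2016', [13, 23]],
    ['4', '10/12/2016', [42, 52]],
    ['7', '12/12/2016', [42, 52]],
]
print(first_of_repeated(rows))
# prints [['2', '12/12/2016', [36, 46]], ['4', '10/12/2016', [13, 23]]]
```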

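On further edit: here is a solution that keeps the first sublist for each 2nd element and removes the later repeats (sublists whose 2nd element is unique are kept as well):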

d = dict()

for i,subList in enumerate(myList):
    if subList[1] not in d:
        d[subList[1]] = i  # stores first index

noDoubles = [subList for i,subList in enumerate(myList) if i == d[subList[1]]]
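If the indices themselves are not needed, the same result can be obtained in a single pass with a set of already-seen values. The sketch below is one possible way to do it, under the assumption that the 2nd elements are hashable (strings, as in the question); the name drop_later_repeats is hypothetical.

```python
def drop_later_repeats(rows, key_index=1):
    """Keep the first row for each key value and drop any later rows with the same key."""
    seen = set()
    kept = []
    for row in rows:
        key = row[key_index]
        if key not in seen:
            # First time this key value appears: remember it and keep the row.
            seen.add(key)
            kept.append(row)
    return kept

rows = [
    ['1', '13/12/2016', [42, 52]],
    ['2', '12/12/2016', [36, 46]],
    ['4', '10/12/2016', [13, 23]],
    ['4', '10/12/2016', [42, 52]],
    ['7', '12/12/2016', [42, 52]],
]
print(drop_later_repeats(rows))
# prints [['1', '13/12/2016', [42, 52]], ['2', '12/12/2016', [36, 46]], ['4', '10/12/2016', [13, 23]]]
```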

Upvotes: 1

Julien Spronck

Reputation: 15423

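You can use a list comprehension. Note that it rescans the whole list for every sublist, so the running time grows quadratically with the length of the list: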

lst = [['1', '13/12/2016', [42, 52]], ['2', '12/12/2016', [36, 46]], ['4', '10/12/2016', [13, 23]], ['4', '10/12/2016', [42, 52]], ['5', '08/12/2016', [42, 52]], ['6', '07/12/2016', [32, 42]], ['7', '12/12/2016', [42, 52]], ['8', '06/12/2016', [42, 52]], ['10', '12/12/2016', [45, 55]], ['11', '08/12/2016', [42, 52]]]
endlist = [sublist for sublist in lst if sum(x[1] == sublist[1] for x in lst) > 1]
# [['2', '12/12/2016', [36, 46]], ['4', '10/12/2016', [13, 23]], ['4', '10/12/2016', [42, 52]], ['5', '08/12/2016', [42, 52]], ['7', '12/12/2016', [42, 52]], ['10', '12/12/2016', [45, 55]], ['11', '08/12/2016', [42, 52]]]

Upvotes: 2
