Reputation: 8109
Example:
list = [['1', '13/12/2016', [42, 52]], ['2', '12/12/2016', [36, 46]], ['4', '10/12/2016', [13, 23]], ['4', '10/12/2016', [42, 52]], ['5', '08/12/2016', [42, 52]], ['6', '07/12/2016', [32, 42]], ['7', '12/12/2016', [42, 52]], ['8', '06/12/2016', [42, 52]], ['10', '12/12/2016', [45, 55]], ['11', '08/12/2016', [42, 52]]]
I want to check the 2nd element of every sublist to see whether it repeats.
I want to keep the entire sublist if its 2nd element also appears in another sublist.
Expected results:
endlist = [['2', '12/12/2016', [36, 46]], ['4', '10/12/2016', [13, 23]], ['4', '10/12/2016', [42, 52]], ['5', '08/12/2016', [42, 52]], ['7', '12/12/2016', [42, 52]], ['10', '12/12/2016', [45, 55]], ['11', '08/12/2016', [42, 52]]]
(08/12/2016 | 10/12/2016 | 12/12/2016 are the doubles)
I know how to keep doubles in a flat list ([x for x in l if l.count(x) > 1]),
but how do I do this for a list of sublists?
Upvotes: 0
Views: 172
Reputation: 51998
You can gather count information into a dictionary and then use it. This scales well to large lists, because it needs only one pass to count and one pass to filter:
myList = [['1', '13/12/2016', [42, 52]], ['2', '12/12/2016', [36, 46]], ['4', '10/12/2016', [13, 23]], ['4', '10/12/2016', [42, 52]], ['5', '08/12/2016', [42, 52]], ['6', '07/12/2016', [32, 42]], ['7', '12/12/2016', [42, 52]], ['8', '06/12/2016', [42, 52]], ['10', '12/12/2016', [45, 55]], ['11', '08/12/2016', [42, 52]]]
d = dict()
for subList in myList:
    if subList[1] in d:
        d[subList[1]] += 1
    else:
        d[subList[1]] = 1
doubles = [subList for subList in myList if d[subList[1]] >= 2]
You can of course replace >= 2 with == 2 if you want doubles to exclude triples, etc.
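As a side note, the counting loop above can also be written with collections.Counter from the standard library. This is only a minimal sketch of the same idea, not a different algorithm; the names my_list, counts and doubles are chosen here for illustration.

```python
from collections import Counter

my_list = [['1', '13/12/2016', [42, 52]], ['2', '12/12/2016', [36, 46]],
           ['4', '10/12/2016', [13, 23]], ['4', '10/12/2016', [42, 52]],
           ['5', '08/12/2016', [42, 52]], ['6', '07/12/2016', [32, 42]],
           ['7', '12/12/2016', [42, 52]], ['8', '06/12/2016', [42, 52]],
           ['10', '12/12/2016', [45, 55]], ['11', '08/12/2016', [42, 52]]]

# Count how often each 2nd element (the date) occurs across all sublists.
counts = Counter(sub[1] for sub in my_list)

# Keep every sublist whose date occurs at least twice, in original order.
doubles = [sub for sub in my_list if counts[sub[1]] >= 2]
print(doubles)
```

Counter behaves like the dictionary d above, so the >= 2 versus == 2 choice applies here in the same way.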
On edit: if you want to keep only the first occurrence of each doubled sublist, modify the dictionary so that it stores the indices at which each 2nd element occurs. Something like this:
d = dict()
for i, subList in enumerate(myList):
    if subList[1] in d:
        d[subList[1]].append(i)
    else:
        d[subList[1]] = [i]
firsts = [subList for i,subList in enumerate(myList) if len(d[subList[1]]) >= 2 and i == d[subList[1]][0]]
print(firsts) #prints [['2', '12/12/2016', [36, 46]], ['4', '10/12/2016', [13, 23]], ['5', '08/12/2016', [42, 52]]]
On further edit: Here is a solution that removes subsequent doubles (it keeps the first sublist for each 2nd element, including those that never repeat):
d = dict()
for i, subList in enumerate(myList):
    if subList[1] not in d:
        d[subList[1]] = i  # stores first index
noDoubles = [subList for i,subList in enumerate(myList) if i == d[subList[1]]]
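If only the de-duplicated list is needed, the same result can probably be reached in a single pass with a set of already-seen 2nd elements. This is a hedged sketch under that assumption; the names my_list, seen and no_doubles are illustrative and not part of the code above.

```python
my_list = [['1', '13/12/2016', [42, 52]], ['2', '12/12/2016', [36, 46]],
           ['4', '10/12/2016', [13, 23]], ['4', '10/12/2016', [42, 52]],
           ['5', '08/12/2016', [42, 52]], ['6', '07/12/2016', [32, 42]],
           ['7', '12/12/2016', [42, 52]], ['8', '06/12/2016', [42, 52]],
           ['10', '12/12/2016', [45, 55]], ['11', '08/12/2016', [42, 52]]]

seen = set()       # 2nd elements (dates) encountered so far
no_doubles = []    # first sublist for each distinct date, in original order

for sub in my_list:
    if sub[1] not in seen:
        seen.add(sub[1])
        no_doubles.append(sub)

print(no_doubles)
```

This avoids storing indices, at the cost of building the result in an explicit loop rather than a comprehension.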
Upvotes: 1
Reputation: 15423
You can use a list comprehension:
lst = [['1', '13/12/2016', [42, 52]], ['2', '12/12/2016', [36, 46]], ['4', '10/12/2016', [13, 23]], ['4', '10/12/2016', [42, 52]], ['5', '08/12/2016', [42, 52]], ['6', '07/12/2016', [32, 42]], ['7', '12/12/2016', [42, 52]], ['8', '06/12/2016', [42, 52]], ['10', '12/12/2016', [45, 55]], ['11', '08/12/2016', [42, 52]]]
endlist = [sublist for sublist in lst if sum(x[1] == sublist[1] for x in lst) > 1]
# [['2', '12/12/2016', [36, 46]], ['4', '10/12/2016', [13, 23]], ['4', '10/12/2016', [42, 52]], ['5', '08/12/2016', [42, 52]], ['7', '12/12/2016', [42, 52]], ['10', '12/12/2016', [45, 55]], ['11', '08/12/2016', [42, 52]]]
Upvotes: 2