Reputation: 1280
I have a list of lists in Python similar to the following:
[
['name1',value2],
['name2',value3],
['name3',value4],
['name4',value4],
['name5',value5],
['name6',value2],
['name7',value2],
['name8',value4]
]
I want to keep at most two inner lists for each 'value' and remove any further duplicates. The resulting list would look like:
[
['name1',value2],
['name2',value3],
['name3',value4],
['name4',value4],
['name5',value5],
['name6',value2]
]
Edit:
I didn't think this would be a problem, so I kept it simple for a clear question, but I actually have four values, not two, in each internal list, i.e.:
[
['name1',value2,'something','else'],
['name2',value3,'something','else'],
['name3',value4,'something','else'],
['name4',value4,'something','else'],
['name5',value5,'something','else'],
['name6',value2,'something','else']
]
Ashwini Chaudhary's answer works perfectly but only returns the first two elements and not all four. My fault for not adding the complete details. Lesson learned!
Upvotes: 3
Views: 99
Reputation: 250881
If order doesn't matter:
In [14]: lis=[
['name1','value2','something','else'],
['name2','value3','something','else'],
['name3','value4','something','else'],
['name4','value4','something','else'],
['name5','value5','something','else'],
['name6','value2','something','else']
]
In [22]: dic={}
In [23]: for x in lis:
   ....:     dic.setdefault(x[1],[]).append([x[0]]+x[2:])
   ....:
In [25]: dic
Out[25]:
{'value2': [['name1', 'something', 'else'], ['name6', 'something', 'else']],
'value3': [['name2', 'something', 'else']],
'value4': [['name3', 'something', 'else'], ['name4', 'something', 'else']],
'value5': [['name5', 'something', 'else']]}
In [27]: [[y[0]]+[x]+y[1:] for x in dic for y in dic[x][:2]]
Out[27]:
[['name5', 'value5', 'something', 'else'],
['name3', 'value4', 'something', 'else'],
['name4', 'value4', 'something', 'else'],
['name2', 'value3', 'something', 'else'],
['name1', 'value2', 'something', 'else'],
['name6', 'value2', 'something', 'else']]
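A slightly simpler variant of the grouping above, as a sketch only: store each whole row under its value, so nothing has to be reassembled afterwards. The names `groups` and `result` are illustrative, the input adds the two extra rows from the question's first example, and the ordering noted in the comment assumes Python 3.7+, where dicts keep insertion order.

```python
lis = [
    ['name1', 'value2', 'something', 'else'],
    ['name2', 'value3', 'something', 'else'],
    ['name3', 'value4', 'something', 'else'],
    ['name4', 'value4', 'something', 'else'],
    ['name5', 'value5', 'something', 'else'],
    ['name6', 'value2', 'something', 'else'],
    ['name7', 'value2', 'something', 'else'],  # third 'value2', dropped
    ['name8', 'value4', 'something', 'else'],  # third 'value4', dropped
]

# Group complete rows by their second element.
groups = {}
for row in lis:
    groups.setdefault(row[1], []).append(row)

# Keep at most two rows per group. Rows come out grouped by value
# (in first-seen order on Python 3.7+), not in their original order.
result = [row for rows in groups.values() for row in rows[:2]]
```

Because the rows are grouped by value, `name6` ends up right after `name1` here; if the original row order matters, a running count per value is the better fit.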
Upvotes: 1
Reputation: 4097
from collections import defaultdict
list1 = [['name1','value2'],
['name2','value3'],
['name3','value4'],
['name4','value4'],
['name5','value5'],
['name6','value2'],
['name7','value2'],
['name8','value4']]
list2 = [['name1','value2'],
['name2','value3'],
['name3','value4'],
['name4','value4'],
['name5','value5'],
['name6','value2']]
d = defaultdict(list)
for name, value in list1:
    d[value].append(name)
list3 = [[name, value] for value, names in d.items() for name in names[:2]]
print(sorted(list3) == sorted(list2)) # True
I am certain that someone will come up with a better solution that preserves order and works as an iterator.
Upvotes: 0
Reputation: 1767
This code does the trick:
from collections import defaultdict
def dup2(sequence):
    seen = defaultdict(int)
    for key, value in sequence:
        if seen[value] < 2:
            seen[value] += 1
            yield [key, value]
dup2 is a generator, so it processes the list as you iterate over the result:
for key, value in dup2(seq):
    # ... your code here
To get the result as a plain list, use the list function:
list(dup2(seq))
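With the four-element rows from the question's edit, the two-name unpacking in dup2 would raise a ValueError. One possible generalisation, as a minimal sketch: `limit_duplicates`, `key_index`, and `limit` are hypothetical names introduced here, not part of the answer above.

```python
from collections import defaultdict

def limit_duplicates(rows, key_index=1, limit=2):
    """Yield rows in their original order, keeping at most
    `limit` rows for each distinct value at `key_index`."""
    seen = defaultdict(int)
    for row in rows:
        key = row[key_index]
        if seen[key] < limit:
            seen[key] += 1
            yield row  # the whole row is passed through unchanged

rows = [
    ['name1', 'value2', 'something', 'else'],
    ['name2', 'value3', 'something', 'else'],
    ['name3', 'value4', 'something', 'else'],
    ['name4', 'value4', 'something', 'else'],
    ['name5', 'value5', 'something', 'else'],
    ['name6', 'value2', 'something', 'else'],
    ['name7', 'value2', 'something', 'else'],  # third 'value2', skipped
    ['name8', 'value4', 'something', 'else'],  # third 'value4', skipped
]

kept = list(limit_duplicates(rows))
```

Yielding the row itself, rather than rebuilding it from unpacked names, means the row length no longer matters.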
Upvotes: 2