Reputation: 197
I have this array of data
data = [20001202.05, 20001202.05, 20001202.50, 20001215.75, 20021215.75]
I remove the duplicate data with list(set(data)), which gives me:
data = [20001202.05, 20001202.50, 20001215.75, 20021215.75]
But I would like to remove duplicates based on the digits before the decimal point; for instance, if the array contains both 20001202.05 and 20001202.50, I want to keep only one of them.
Upvotes: 1
Views: 8440
Reputation: 15755
In general, with Python 3.7+, dictionaries maintain insertion order, so you can do this even when order matters:
data = {d:None for d in data}.keys()
However, for the OP's original problem, the goal is to de-duplicate on the integer part, not the raw number, so see the top-voted answer. In general, though, this will remove true duplicates.
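A minimal sketch of the same idea, assuming Python 3.7+. Here dict.fromkeys is a standard-library equivalent of the comprehension above, and wrapping the result in list() gives a plain list instead of a keys view; the name deduped is only illustrative.

```python
data = [20001202.05, 20001202.05, 20001202.50, 20001215.75, 20021215.75]

# Dict keys are unique and keep insertion order (Python 3.7+),
# so the first occurrence of each value survives, in its original position.
deduped = list(dict.fromkeys(data))
print(deduped)  # [20001202.05, 20001202.5, 20001215.75, 20021215.75]
```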
Upvotes: 4
Reputation: 141
data1 = [20001202.05, 20001202.05, 20001202.50, 20001215.75, 20021215.75]
ls = []  # unique values, in first-seen order
for i in data1:
    if i not in ls:
        ls.append(i)
print(ls)
Upvotes: 1
Reputation: 3410
As you don't care about the order of the items you keep, you could do:
>>> {int(d):d for d in data}.values()
[20001202.5, 20021215.75, 20001215.75]
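The output above looks like it comes from Python 2. As a sketch, assuming Python 3.7+, the same expression returns a view rather than a list, so it needs list() around it. Each integer key keeps the last value seen for it, and the keys stay in first-seen order; the name kept is only illustrative.

```python
data = [20001202.05, 20001202.05, 20001202.50, 20001215.75, 20021215.75]

# int(d) drops the fractional part, so values sharing an integer part
# land on the same key; a later value overwrites an earlier one.
kept = list({int(d): d for d in data}.values())
print(kept)  # [20001202.5, 20001215.75, 20021215.75]
```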
If you would like to keep the lowest item, I can't think of a one-liner.
Here is a basic example for anybody who would like to add a condition on the key or value to keep.
seen = set()
result = []
for item in sorted(data):
    key = int(item)  # or whatever condition
    if key not in seen:
        result.append(item)
        seen.add(key)
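If you need this in more than one place, the loop can be wrapped in a small helper. This is only a sketch: dedupe_by and its key parameter are names made up here, not a standard-library function. Passing the data unsorted keeps the first occurrence per key, and passing sorted(data) keeps the lowest value per key.

```python
def dedupe_by(items, key):
    """Keep the first item seen for each key, preserving input order."""
    seen = set()
    result = []
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)
            result.append(item)
    return result


data = [20001202.05, 20001202.05, 20001202.50, 20001215.75, 20021215.75]

first_seen = dedupe_by(data, key=int)      # first occurrence per integer part
lowest = dedupe_by(sorted(data), key=int)  # lowest value per integer part
print(first_seen)
print(lowest)
```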
Upvotes: 11