Reputation: 2587
I am trying to get a unique list of objects. I have some code that pulls data from an API and puts that data into objects, which I then append to a list. However, some of the objects are duplicates, and I would like to know how to remove them.
Sample list data:
[
Policy: 'SQL',
SecondaryPolicy: 'ORACLE',
Level: 'Primary On Call Engineer',
LevelNo: 1,
StartDate: None,
EndDate: None,
StartTime: None,
EndTime: None,
Name: 'Fred',
Mobile: '123',
Policy: 'Comms',
SecondaryPolicy: '',
Level: 'Primary On Call Engineer',
LevelNo: 1,
StartDate: None,
EndDate: None,
StartTime: None,
EndTime: None,
Name: 'Bob',
Mobile: '456',
Policy: 'Infra',
SecondaryPolicy: '',
Level: 'Primary On Call Engineer',
LevelNo: 1,
StartDate: None,
EndDate: None,
StartTime: None,
EndTime: None,
Name: 'Bill',
Mobile: '789',
Policy: 'Comms',
SecondaryPolicy: '',
Level: 'Primary On Call Engineer',
LevelNo: 1,
StartDate: None,
EndDate: None,
StartTime: None,
EndTime: None,
Name: 'Bob',
Mobile: '456',
]
Code (I've removed some of the object data and put in sample data; for this test I'm just trying to get Fred's result returned once):
objPolicyData = getUserData()
OnCallData = []
for UserItem in objPolicyData['users']:
    UserData = User()
    # get the user object from DB
    UserData.Name = 'Fred'
    for OnCall in UserItem['on_call']:
        UserPolicy = OnCall['escalation_policy']
        UserData.Policy = 'SQL'
        UserData.SecondaryPolicy = 'ORACLE'
        OnCallData.append(UserData)
Attempts: I tried this:
clean_on_call_data = {User.Name for User in OnCallData}
but this only prints
set(['Fred'])
Where are the other fields of the objects, and how would I iterate over them?
EDIT: this is my class. Is the __cmp__ correct? How do I remove the duplicates?
class User(object):
    __attrs = ['Policy', 'SecondaryPolicy', 'Name']

    def __init__(self, **kwargs):
        for attr in self.__attrs:
            setattr(self, attr, kwargs.get(attr, None))

    def __repr__(self):
        return ', '.join(
            ['%s: %r' % (attr, getattr(self, attr)) for attr in self.__attrs])

    def __cmp__(self):
        if self.Name != other.Name:
Upvotes: 2
Views: 207
Reputation: 2424
For Python 2.x
I think you'll want to implement __cmp__ for your class that stores the API data.
For Python 3.x
I think you'll want to implement __eq__ and __hash__ for your class that stores the API data.
Regardless of the Python version, you can use the comparison method to check for duplicates in your list. The simplest way is set(list), but note that a set looks objects up by __hash__ before it compares them, so __hash__ has to be defined consistently with your comparison in either version. A set is an unordered collection of unique objects, so the original list order is not preserved.
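As a rough sketch of what that could look like (the class below is a cut-down stand-in for the one in the question, and treating Name, Policy and SecondaryPolicy together as the identity of a user is my assumption about what counts as a duplicate):

```python
class User(object):
    """Cut-down stand-in for the question's class; only three fields are kept."""

    def __init__(self, Name=None, Policy=None, SecondaryPolicy=None):
        self.Name = Name
        self.Policy = Policy
        self.SecondaryPolicy = SecondaryPolicy

    def _key(self):
        # Assumption: two users are duplicates when all three fields match.
        return (self.Name, self.Policy, self.SecondaryPolicy)

    def __eq__(self, other):
        if not isinstance(other, User):
            return NotImplemented
        return self._key() == other._key()

    def __ne__(self, other):
        # Python 2 does not derive != from ==, so spell it out.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        # Must agree with __eq__: equal objects need equal hashes.
        return hash(self._key())

    def __repr__(self):
        return 'User(%r, %r, %r)' % self._key()


on_call = [
    User('Fred', 'SQL', 'ORACLE'),
    User('Bob', 'Comms', ''),
    User('Bill', 'Infra', ''),
    User('Bob', 'Comms', ''),
]

# The second Bob compares equal to the first and hashes the same,
# so the set keeps only one of them.
unique_users = set(on_call)
```

One caveat with this design: because the hash is computed from the fields, the objects should not be modified after they have been put into the set.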
Upvotes: 2
Reputation: 2520
You could subclass the User class and implement the __eq__ and __hash__ methods, then just add the instances to a set, like this:
class UserUnique(User):
    def __hash__(self):
        return hash(self.Name)

    def __eq__(self, o):
        return self.Name == o.Name
Then you can use it like this:
OnCallData = set()
for UserItem in objPolicyData['users']:
    UserData = UserUnique()
    UserData.Name = 'Fred'
    for OnCall in UserItem['on_call']:
        UserPolicy = OnCall['escalation_policy']
        UserData.Policy = 'SQL'
        UserData.SecondaryPolicy = 'ORACLE'
        OnCallData.add(UserData)
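If you would rather not touch the class, or you need to keep the original order (a set does not), a different technique is to track the keys you have already seen. This is only a sketch: unique_by is a hypothetical helper name, the User class is a simplified stand-in, and deduplicating on Name alone is an assumption carried over from the subclass above.

```python
class User(object):
    """Simplified stand-in for the question's class (no __eq__ or __hash__)."""

    def __init__(self, Name, Policy, SecondaryPolicy):
        self.Name = Name
        self.Policy = Policy
        self.SecondaryPolicy = SecondaryPolicy


def unique_by(items, key):
    """Return items in their original order, keeping the first item per key."""
    seen = set()
    result = []
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)
            result.append(item)
    return result


on_call = [
    User('Fred', 'SQL', 'ORACLE'),
    User('Bob', 'Comms', ''),
    User('Bill', 'Infra', ''),
    User('Bob', 'Comms', ''),
]

# Assumption: the Name alone identifies a user.
unique_users = unique_by(on_call, key=lambda u: u.Name)
names = [u.Name for u in unique_users]
```

To compare on more fields, return a tuple from the key function, for example lambda u: (u.Name, u.Policy).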
Upvotes: 0
Reputation: 3497
How about using dictionaries and then a pandas.DataFrame?
Something like:
import pandas as pd

d1 = {
'Policy': 'SQL',
'SecondaryPolicy': 'ORACLE',
'Level': 'Primary On Call Engineer',
'LevelNo': 1,
'StartDate': None,
'EndDate': None,
'StartTime': None,
'EndTime': None,
'Name': 'Fred',
'Mobile': '123',
}
d2 = {
'Policy': 'Comms',
'SecondaryPolicy': '',
'Level': 'Primary On Call Engineer',
'LevelNo': 1,
'StartDate': None,
'EndDate': None,
'StartTime': None,
'EndTime': None,
'Name': 'Bob',
'Mobile': '456',
}
d3 = {
'Policy': 'Infra',
'SecondaryPolicy': '',
'Level': 'Primary On Call Engineer',
'LevelNo': 1,
'StartDate': None,
'EndDate': None,
'StartTime': None,
'EndTime': None,
'Name': 'Bill',
'Mobile': '789',
}
d4 = {
'Policy': 'Comms',
'SecondaryPolicy': '',
'Level': 'Primary On Call Engineer',
'LevelNo': 1,
'StartDate': None,
'EndDate': None,
'StartTime': None,
'EndTime': None,
'Name': 'Bob',
'Mobile': '456',
}
data = pd.DataFrame([d1,d2,d3,d4])
data[ data.Name=='Fred' ]
That returns only the rows whose Name is 'Fred'.
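Since the question is really about removing duplicates, DataFrame.drop_duplicates is probably the more relevant call once the data is in a frame. A small sketch, with the dictionaries trimmed to three columns for brevity (the variable names are mine):

```python
import pandas as pd

# Trimmed-down versions of the dictionaries above; the last row repeats Bob.
rows = [
    {'Policy': 'SQL', 'Name': 'Fred', 'Mobile': '123'},
    {'Policy': 'Comms', 'Name': 'Bob', 'Mobile': '456'},
    {'Policy': 'Infra', 'Name': 'Bill', 'Mobile': '789'},
    {'Policy': 'Comms', 'Name': 'Bob', 'Mobile': '456'},
]
data = pd.DataFrame(rows)

# Drop rows that are identical in every column; the first occurrence is kept.
unique_rows = data.drop_duplicates().reset_index(drop=True)

# Alternatively, treat rows as duplicates when only the Name matches.
unique_by_name = data.drop_duplicates(subset=['Name'])
```

By default the first occurrence of each duplicate is kept, so the remaining rows stay in their original order.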
Upvotes: 0