Reputation: 193
I have a list of employees who work at different times of the day. I would like to count the number of days each person has worked, like so:
FOO : 3 BAZ : 3 NOM : 1 etc....
This is how I receive the raw data:
my_list = [('NOM', datetime.date(2030, 1, 1)),
('BAR', datetime.date(2019, 4, 8)),
('HAM', datetime.date(2019, 4, 8)),
('FOO', datetime.date(2019, 4, 8)),
('BAZ', datetime.date(2019, 4, 8)),
('BAR', datetime.date(2019, 4, 10)),
('BAZ', datetime.date(2019, 4, 10)),
('FOO', datetime.date(2019, 4, 10)),
('HAM', datetime.date(2019, 4, 10)),
('HAM', datetime.date(2019, 4, 10)),
('FOO', datetime.date(2019, 4, 10)),
('BAR', datetime.date(2019, 4, 10)),
('BAZ', datetime.date(2019, 4, 10)),
('BAZ', datetime.date(2019, 4, 11)),
('HAM', datetime.date(2019, 4, 11)),
('BAR', datetime.date(2019, 4, 11)),
('FOO', datetime.date(2019, 4, 11)),
('BAZ', datetime.date(2019, 4, 11)),
('FOO', datetime.date(2019, 4, 11)),
('BAR', datetime.date(2019, 4, 11)),
('HAM', datetime.date(2019, 4, 11)),
('FOO', datetime.date(2019, 4, 11)),
('BAZ', datetime.date(2019, 4, 11)),
('BAR', datetime.date(2019, 4, 11)),
('HAM', datetime.date(2019, 4, 11)),
('FOO', datetime.date(2019, 4, 11)),
('BAZ', datetime.date(2019, 4, 11)),
('BAR', datetime.date(2019, 4, 11)),
('HAM', datetime.date(2019, 4, 11))]
I managed to reduce the list to unique days for each individual like so:
a = Counter(set(my_list))
which removes duplicate entries for the same person on the same day:
Counter({('HAM', datetime.date(2019, 4, 8)): 1,
('HAM', datetime.date(2019, 4, 10)): 1,
('HAM', datetime.date(2019, 4, 11)): 1,
('BAR', datetime.date(2019, 4, 8)): 1,
('BAR', datetime.date(2019, 4, 10)): 1,
('BAR', datetime.date(2019, 4, 11)): 1,
('FOO', datetime.date(2019, 4, 8)): 1,
('FOO', datetime.date(2019, 4, 10)): 1,
('FOO', datetime.date(2019, 4, 11)): 1,
('BAZ', datetime.date(2019, 4, 8)): 1,
('BAZ', datetime.date(2019, 4, 10)): 1,
('BAZ', datetime.date(2019, 4, 11)): 1,
('NOM', datetime.date(2030, 1, 1)): 1})
This is where I'm stuck: how do I go from the above to:
HAM:3
BAR:3
FOO:3
BAZ:3
NOM:1
Upvotes: 1
Views: 96
Reputation: 791
Convert the list to a pandas DataFrame, drop duplicates, and group by name:
import datetime
import pandas as pd
my_list = [('NOM', datetime.date(2030, 1, 1)),
('BAR', datetime.date(2019, 4, 8)),
('HAM', datetime.date(2019, 4, 8)),
('FOO', datetime.date(2019, 4, 8)),
('BAZ', datetime.date(2019, 4, 8)),
('BAR', datetime.date(2019, 4, 10)),
('BAZ', datetime.date(2019, 4, 10)),
('FOO', datetime.date(2019, 4, 10)),
('HAM', datetime.date(2019, 4, 10)),
('HAM', datetime.date(2019, 4, 10)),
('FOO', datetime.date(2019, 4, 10)),
('BAR', datetime.date(2019, 4, 10)),
('BAZ', datetime.date(2019, 4, 10)),
('BAZ', datetime.date(2019, 4, 11)),
('HAM', datetime.date(2019, 4, 11)),
('BAR', datetime.date(2019, 4, 11)),
('FOO', datetime.date(2019, 4, 11)),
('BAZ', datetime.date(2019, 4, 11)),
('FOO', datetime.date(2019, 4, 11)),
('BAR', datetime.date(2019, 4, 11)),
('HAM', datetime.date(2019, 4, 11)),
('FOO', datetime.date(2019, 4, 11)),
('BAZ', datetime.date(2019, 4, 11)),
('BAR', datetime.date(2019, 4, 11)),
('HAM', datetime.date(2019, 4, 11)),
('FOO', datetime.date(2019, 4, 11)),
('BAZ', datetime.date(2019, 4, 11)),
('BAR', datetime.date(2019, 4, 11)),
('HAM', datetime.date(2019, 4, 11))]
# Convert the list of tuples to a DataFrame
df = pd.DataFrame(my_list, columns=['name', 'date'])
# Remove duplicate (name, date) rows
df.drop_duplicates(inplace=True)
# Count the remaining rows per name, i.e. the distinct days worked
df.groupby('name').count()
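The two steps above (drop duplicates, then count) can also be collapsed into one by counting distinct dates per name. This is a minimal sketch on a shortened sample of the question's data; `sample` and `days_worked` are illustrative names, not part of the original answer.

```python
import datetime
import pandas as pd

# Shortened, illustrative sample of the question's data (duplicates included)
sample = [
    ('NOM', datetime.date(2030, 1, 1)),
    ('FOO', datetime.date(2019, 4, 8)),
    ('FOO', datetime.date(2019, 4, 10)),
    ('FOO', datetime.date(2019, 4, 10)),
    ('BAZ', datetime.date(2019, 4, 11)),
    ('BAZ', datetime.date(2019, 4, 11)),
]

df = pd.DataFrame(sample, columns=['name', 'date'])

# nunique() counts the distinct dates within each name group,
# so no separate drop_duplicates() step is needed
days_worked = df.groupby('name')['date'].nunique()
print(days_worked.to_dict())
```

The result is a Series indexed by name, which may be more convenient than the one-column DataFrame returned by count().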
Upvotes: 1
Reputation: 19885
Use itertools.groupby:
from itertools import groupby
from operator import itemgetter
# groupby yields iterators, which have no len(), so materialize each group first
result = {key: len(list(group)) for key, group in groupby(sorted(set(my_list)), key=itemgetter(0))}
print(result)
This removes duplicates from my_list with set, sorts the remaining tuples so that entries with the same name are adjacent, partitions them into groups by name, and finally stores each name and the size of its group as a key-value pair in a dict.
Output:
{'BAR': 3, 'BAZ': 3, 'FOO': 3, 'HAM': 3, 'NOM': 1}
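If sorting is not wanted, the same count can probably be obtained by collecting the distinct dates per name in a dictionary of sets. This is only a sketch on a shortened sample of the question's data; `sample`, `dates_by_name` and `result` are illustrative names.

```python
import datetime
from collections import defaultdict

# Shortened, illustrative sample of the question's data (duplicates included)
sample = [
    ('NOM', datetime.date(2030, 1, 1)),
    ('FOO', datetime.date(2019, 4, 8)),
    ('FOO', datetime.date(2019, 4, 10)),
    ('FOO', datetime.date(2019, 4, 10)),
    ('BAZ', datetime.date(2019, 4, 11)),
    ('BAZ', datetime.date(2019, 4, 11)),
]

# Map each name to the set of distinct dates on which that person worked;
# adding a date that is already present leaves the set unchanged
dates_by_name = defaultdict(set)
for name, day in sample:
    dates_by_name[name].add(day)

# The number of days worked is the size of each set
result = {name: len(days) for name, days in dates_by_name.items()}
print(result)
```

Unlike groupby, this does not need the input to be sorted, because entries are matched by dictionary key rather than by adjacency.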
Upvotes: 3
Reputation: 323226
You can do this with collections.Counter:
import collections
collections.Counter(x for x, y in set(my_list))
Out[251]: Counter({'BAR': 3, 'BAZ': 3, 'FOO': 3, 'HAM': 3, 'NOM': 1})
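To get from such a Counter to the NAME:count lines asked for in the question, something like the following sketch should work. It uses a shortened sample of the question's data; `sample`, `counts` and `lines` are illustrative names, and sorting by name is an assumption made only to keep the output order stable.

```python
import datetime
from collections import Counter

# Shortened, illustrative sample of the question's data (duplicates included)
sample = [
    ('NOM', datetime.date(2030, 1, 1)),
    ('FOO', datetime.date(2019, 4, 8)),
    ('FOO', datetime.date(2019, 4, 10)),
    ('FOO', datetime.date(2019, 4, 10)),
    ('BAZ', datetime.date(2019, 4, 11)),
    ('BAZ', datetime.date(2019, 4, 11)),
]

# set() drops repeated (name, date) pairs; Counter then tallies names
counts = Counter(name for name, _ in set(sample))

# Sort by name because iteration order over a set is not guaranteed
lines = [f"{name}:{count}" for name, count in sorted(counts.items())]
print("\n".join(lines))
```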
Upvotes: 6