Reputation: 1946
I have a pandas data frame that contains thousands of dates and IDs as follows:
2/1/18 123
2/1/18 123
2/1/18 456
2/1/18 789
I also have a short list of only a few IDs such as:
ids=['123','456','909']
I need to loop through the list and get a count of each value in the list by date. So the result should look like;
2/1/18 123 2
456 1
909 0
I can do a groupby easy enough on the data frame but that will return all ids and I only need the values from the id list.
Upvotes: 2
Views: 1670
Reputation: 323226
I am using reindex
here
df.loc[df.Id.isin(ids)].groupby('Date').Id.value_counts().reindex(index=pd.MultiIndex.from_product([df.Date.unique(),ids]),fill_value=0)
Out[1116]:
2/1/18 123 2
456 1
909 0
Name: Id, dtype: int64
Upvotes: 1
Reputation: 6543
Is this what you're after?
import pandas as pd
df = pd.DataFrame([['2/1/18', 123],
['2/1/18', 123],
['2/1/18', 456],
['2/1/18', 789]],
columns=['Date', 'ID'])
ids = ['123','456','909']
df['count'] = 1
results = df[df['ID'].isin(ids)].groupby(['Date', 'ID']).count()
Upvotes: 4