Reputation: 45
I'm very new to pandas library in python and I've been trying to piece together how to take a dataframe like this
'Date' 'Color'
0 '05-10-2017' 'Red'
1 '05-10-2017' 'Green'
2 '05-10-2017' 'Blue'
3 '05-10-2017' 'Red'
4 '05-10-2017' 'Blue'
5 '05-11-2017' 'Red'
6 '05-11-2017' 'Green'
7 '05-11-2017' 'Red'
8 '05-11-2017' 'Green'
9 '05-11-2017' 'Blue'
10 '05-11-2017' 'Blue'
11 '05-11-2017' 'Red'
12 '05-11-2017' 'Blue'
13 '05-11-2017' 'Blue'
14 '05-12-2017' 'Green'
15 '05-12-2017' 'Blue'
16 '05-12-2017' 'Red'
17 '05-12-2017' 'Blue'
18 '05-12-2017' 'Blue'
and output one that has unique dates as an index, the colors as column headers the value count for each day like this:
'Date' 'Red' 'Green' 'Blue'
'05-10-2017' 2 1 2
'05-11-2017' 3 2 3
'05-12-2017' 1 1 3
I have been struggling searching through this site for the past two days trying to piece together a way to achieve this and so far I have only been able to generate the index of unique dates. I'm having some trouble with value_counts. I would appreciate if anybody would be able to show me a method, or point me in the right direction if this has been answered already. I have exhausted my searching capabilities and have finally resolved to ask my first question ever on here. If I'm an idiot, please be gentle.
Upvotes: 3
Views: 514
Reputation: 862471
You can use:
1.
groupby
+ size
for aggregting and unstack
for reshape:
df1 = df.groupby(["'Date'","'Color'"]).size().unstack(fill_value=0)
print (df1)
'Color' 'Blue' 'Green' 'Red'
'Date'
'05-10-2017' 2 1 2
'05-11-2017' 4 2 3
'05-12-2017' 3 1 1
2.
pivot_table
solution:
df1 = df.pivot_table(index="'Date'",columns="'Color'", aggfunc='size')
print (df1)
'Color' 'Blue' 'Green' 'Red'
'Date'
'05-10-2017' 2 1 2
'05-11-2017' 4 2 3
'05-12-2017' 3 1 1
3.
crosstab
solution, slowier:
df1 = pd.crosstab(df["'Date'"],df["'Color'"])
print (df1)
'Color' 'Blue' 'Green' 'Red'
'Date'
'05-10-2017' 2 1 2
'05-11-2017' 4 2 3
'05-12-2017' 3 1 1
Upvotes: 8