Peter Lucas
Peter Lucas

Reputation: 1991

Create a Pivot table where my values are the count of my column

I have my dtatframe and would like to present the data where the vlaues in the pivot table are simply the count of the strings, which are my columns in the pivot table:

Sample of my df:

trading_book    state
A               Traded Away
B               Dealer Reject
C               Dealer Reject
A               Dealer Reject
B               Dealer Reject
C               Dealer Reject
A               Dealer Reject
D               Dealer Reject
D               Dealer Reject
E               Dealer Reject

Desired Result:

    Traded Away Dealer Reject   Done
Book            
A          1           2          0
B          0           2          0
C          0           2          0
D          0           2          0
E          0           1          0

When I tried this with the following code:

Count_Row = df.shape[0] #gives number of row count
Count_Col = df.shape[1] #gives number of col count
df_Sample = df[['trading_book','state']].head(Count_Row-1)
display(df_Sample)

display(pd.pivot_table(
                   df_Sample, 
                   index=['trading_book'],
                   columns=['state'], 
                   values='state',
                   aggfunc='count'
              ))

I get only the trading books displaying

What needs to be done with the values and aggfunc paramaters?

Upvotes: 1

Views: 880

Answers (3)

piRSquared
piRSquared

Reputation: 294278

You can use at categorical type column in a crosstab. By using categories. You are telling Pandas that this should be considered an option even if it doesn't show up in this particular data set.

states = 'Traded Away;Dealer Reject;Done'.split(';')
pd.crosstab(df.trading_book, pd.Categorical(df.state, states))

col_0         Traded Away  Dealer Reject  Done
trading_book                                  
A                       1              2     0
B                       0              2     0
C                       0              2     0
D                       0              2     0
E                       0              1     0

Upvotes: 1

cs95
cs95

Reputation: 402523

Correcting your pivot_table code:

v = df.pivot_table(
         index='trading_book', 
         columns='state', 
         aggfunc='size', 
         fill_value=0
)

There's no need to specify a values parameter as long as you specify an aggfunc='size' argument. Next, to get your exact output, you'll need to reindex your dataframe along the columns:

v.reindex(columns=np.append(df.state.unique(), 'Done'), fill_value=0)

state         Traded Away  Dealer Reject  Done
trading_book                                  
A                       1              2     0
B                       0              2     0
C                       0              2     0
D                       0              2     0
E                       0              1     0

Alternatively, specify the new columns you want in a list:

cols = ['Done', ...]
v.assign(**dict.fromkeys(cols, 0))

state         Dealer Reject  Traded Away  Done
trading_book                                  
A                         2            1     0
B                         2            0     0
C                         2            0     0
D                         2            0     0
E                         1            0     0

Upvotes: 2

Allen Qin
Allen Qin

Reputation: 19947

You can use crosstab:

pd.crosstab(df.trading_book,df.state).assign(Done=0)
Out[266]: 
state         Dealer Reject  Traded Away  Done
trading_book                                  
A                         2            1     0
B                         2            0     0
C                         2            0     0
D                         2            0     0
E                         1            0     0

Upvotes: 1

Related Questions