Wojciech Moszczyński

Reputation: 3187

How to add column totals to a crosstab showing the percentage structure of columns?

I have the following problem.
The data file is here: https://archive.ics.uci.edu/ml/machine-learning-databases/autos/ I want to add totals for the columns of a crosstab that shows the percentage structure of each column, but I can't. I only get the sum of the rows.
This is my code:

pd.crosstab(index=df.make, columns=df.body_style, normalize='columns',margins=True).applymap('{:.2f}%'.format)

I get only this: [screenshot of the output, not available]

I need the totals of the columns, not of the rows.

Upvotes: 1

Views: 486

Answers (1)

jezrael

Reputation: 862751

I believe you need the normalize='index' parameter, with the row totals added as an All column:

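# Normalize each row so the body-style shares of every make sum to 1
out = pd.crosstab(index=df.make, columns=df.body_style, normalize='index')
# Add a column holding the row totals (keeps the original df intact)
out['All'] = out.sum(axis=1)
print(out)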

body_style     convertible   hardtop  hatchback     sedan     wagon  All
make                                                                    
alfa-romero       0.666667  0.000000   0.333333  0.000000  0.000000  1.0
audi              0.000000  0.000000   0.142857  0.714286  0.142857  1.0
bmw               0.000000  0.000000   0.000000  1.000000  0.000000  1.0
chevrolet         0.000000  0.000000   0.666667  0.333333  0.000000  1.0
dodge             0.000000  0.000000   0.555556  0.333333  0.111111  1.0
honda             0.000000  0.000000   0.538462  0.384615  0.076923  1.0
isuzu             0.000000  0.000000   0.250000  0.750000  0.000000  1.0
jaguar            0.000000  0.000000   0.000000  1.000000  0.000000  1.0
mazda             0.000000  0.000000   0.588235  0.411765  0.000000  1.0
mercedes-benz     0.125000  0.250000   0.000000  0.500000  0.125000  1.0
mercury           0.000000  0.000000   1.000000  0.000000  0.000000  1.0
mitsubishi        0.000000  0.000000   0.692308  0.307692  0.000000  1.0
nissan            0.000000  0.055556   0.277778  0.500000  0.166667  1.0
peugot            0.000000  0.000000   0.000000  0.636364  0.363636  1.0
plymouth          0.000000  0.000000   0.571429  0.285714  0.142857  1.0
porsche           0.200000  0.400000   0.400000  0.000000  0.000000  1.0
renault           0.000000  0.000000   0.500000  0.000000  0.500000  1.0
saab              0.000000  0.000000   0.500000  0.500000  0.000000  1.0
subaru            0.000000  0.000000   0.250000  0.416667  0.333333  1.0
toyota            0.031250  0.093750   0.437500  0.312500  0.125000  1.0
volkswagen        0.083333  0.000000   0.083333  0.750000  0.083333  1.0
volvo             0.000000  0.000000   0.000000  0.727273  0.272727  1.0

Or, to normalize by columns and add the column totals as an All row:

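# Normalize each column so the make shares within every body style sum to 1
out = pd.crosstab(index=df.make, columns=df.body_style, normalize='columns')
# Append a row holding the column totals (keeps the original df intact)
out.loc['All'] = out.sum(axis=0)
print(out)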

body_style     convertible  hardtop  hatchback     sedan  wagon
make                                                           
alfa-romero       0.333333    0.000   0.014286  0.000000   0.00
audi              0.000000    0.000   0.014286  0.052083   0.04
bmw               0.000000    0.000   0.000000  0.083333   0.00
chevrolet         0.000000    0.000   0.028571  0.010417   0.00
dodge             0.000000    0.000   0.071429  0.031250   0.04
honda             0.000000    0.000   0.100000  0.052083   0.04
isuzu             0.000000    0.000   0.014286  0.031250   0.00
jaguar            0.000000    0.000   0.000000  0.031250   0.00
mazda             0.000000    0.000   0.142857  0.072917   0.00
mercedes-benz     0.166667    0.250   0.000000  0.041667   0.04
mercury           0.000000    0.000   0.014286  0.000000   0.00
mitsubishi        0.000000    0.000   0.128571  0.041667   0.00
nissan            0.000000    0.125   0.071429  0.093750   0.12
peugot            0.000000    0.000   0.000000  0.072917   0.16
plymouth          0.000000    0.000   0.057143  0.020833   0.04
porsche           0.166667    0.250   0.028571  0.000000   0.00
renault           0.000000    0.000   0.014286  0.000000   0.04
saab              0.000000    0.000   0.042857  0.031250   0.00
subaru            0.000000    0.000   0.042857  0.052083   0.16
toyota            0.166667    0.375   0.200000  0.104167   0.16
volkswagen        0.166667    0.000   0.014286  0.093750   0.04
volvo             0.000000    0.000   0.000000  0.083333   0.12
All               1.000000    1.000   1.000000  1.000000   1.00
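
If you also want the percentage formatting from the question, a minimal sketch could look like the one below. The small cars frame is made-up stand-in data (not the autos file), and the names cars, shares and formatted are only illustrative. The fractions are multiplied by 100 before formatting, because '{:.2f}%'.format on its own would print a share of 0.33 as 0.33%.

```python
import pandas as pd

# Hypothetical stand-in data; replace with the real autos DataFrame.
cars = pd.DataFrame({
    "make": ["audi", "audi", "bmw", "bmw", "volvo", "volvo"],
    "body_style": ["sedan", "wagon", "sedan", "sedan", "wagon", "sedan"],
})

# Share of each make within every body style (each column sums to 1).
shares = pd.crosstab(index=cars["make"], columns=cars["body_style"],
                     normalize="columns")

# Append the column totals as an extra "All" row.
shares.loc["All"] = shares.sum(axis=0)

# Turn fractions into percentages, then format every cell as text.
# apply + Series.map is used here instead of applymap, which newer
# pandas versions deprecate.
formatted = (shares * 100).apply(lambda col: col.map("{:.2f}%".format))
print(formatted)
```

Formatting should be the last step, since the formatted frame holds strings and can no longer be summed.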

Upvotes: 2
