AMisra

Reputation: 1889

Percentage histogram with matplotlib where one axis is a combination of two columns

I have a data frame with a class value I am trying to predict, and I am interested in label 1.
I am trying to determine whether turn plays a role for a given key value. For example, for a key value of 1 and a turn number of 1, what percentage of rows have a class value of 1?

For example, for the data given below:

key=1, turn=1: 8/11 have class label 1
key=1, turn=2: 5/6 have class label 1

How can I plot a percentage histogram for this type of data? I know how to plot a normal histogram with matplotlib:

import matplotlib
matplotlib.use('PS')
import matplotlib.pyplot as plt
plt.hist()

but what values would I pass to get a percentage histogram?

Sample columns from the dataframe

key=[ 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 ]

turn=[ 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 1 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4]

class=[0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 0 1 1 0 0 1 1 0 1 0 1 1 0 1 1 0 0 1 0 1 0 0 0 0 0 1 1 1 1 0 1 1 0 0 1 0 0 0 0 0 1 0 1 1 0 0 1 1 1 0 0]

Upvotes: 0

Views: 806

Answers (1)

ImportanceOfBeingErnest

Reputation: 339705

Since the concepts from the linked question are apparently not what you need, an alternative would be to produce pie charts as shown below.


key=[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2 ]
turn=[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4]
clas=[0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0]

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

df = pd.DataFrame({"key": key, "turn": turn, "class": clas})
# The default aggregation is the mean, which for a 0/1 column is the fraction of class 1.
piv = pd.pivot_table(df, values="class", index="key", columns="turn")
print(piv)

# One pie per (key, turn) combination: rows are keys, columns are turns.
fig, axes = plt.subplots(ncols=4, nrows=2)
for i in range(2):
    axes[i,0].set_ylabel("key {}".format(i+1))
    for j in range(4):
        pie = axes[i,j].pie([piv.values[i,j],1.-piv.values[i,j]], autopct="%.1f%%")
        axes[i,j].set_aspect("equal")
        axes[0,j].set_title("turn {}".format(j+1))

# A single figure-level legend, built from the wedges of the last pie drawn.
plt.legend(pie[0],["class 1","class 0"], bbox_to_anchor=(1,0.5), loc="right",
           bbox_transform=plt.gcf().transFigure)
plt.show()
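
If a bar chart is closer to the "percentage histogram" asked about, the same kind of pivot table can feed grouped bars instead of pies. The following is a minimal sketch and not part of the original answer. It assumes the sample lists from the question (key and turn are written as run lengths for brevity), and the name `pct`, the `Agg` backend and the output filename are choices made only for this illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption); remove for on-screen display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Sample data from the question; key and turn are written as run lengths.
key = [1] * 37 + [2] * 29
turn = ([1] * 11 + [2] * 6 + [3] * 14 + [4] * 6
        + [1] * 1 + [2] * 6 + [3] * 11 + [4] * 11)
clas = [0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1,
        0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1,
        0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0]

df = pd.DataFrame({"key": key, "turn": turn, "class": clas})

# The mean of a 0/1 column is the fraction of ones; times 100 gives a percentage.
pct = df.pivot_table(values="class", index="key", columns="turn", aggfunc="mean") * 100

# One group of bars per turn, one bar per key within each group.
fig, ax = plt.subplots()
width = 0.8 / len(pct.index)
x = np.arange(len(pct.columns))
for n, k in enumerate(pct.index):
    ax.bar(x + n * width, pct.loc[k].values, width=width, label="key {}".format(k))

# Centre the tick of each group under its bars and label it with the turn number.
ax.set_xticks(x + width * (len(pct.index) - 1) / 2)
ax.set_xticklabels(list(pct.columns))
ax.set_xlabel("turn")
ax.set_ylabel("rows with class 1 (%)")
ax.set_ylim(0, 100)
ax.legend()
fig.savefig("class1_percent_by_turn.png")  # hypothetical output filename
```

Here `pct.loc[1, 1]` corresponds to the 8/11 from the question and `pct.loc[1, 2]` to the 5/6. Unlike `plt.hist`, which counts occurrences of raw values, `ax.bar` draws heights that have already been computed, which is what a per-group percentage needs.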

Upvotes: 1
