Reputation: 1889
I have a data frame with a class value I am trying to predict. I am interested in label 1.
I am trying to determine if turn plays a role for a given key value.
For a given key value of say 1 and a turn number of 1, what percentage of turns have a class value of 1?
For example, in the data below:
key=1, turn=1: 8/11 have class label 1
key=1, turn=2: 5/6 have class label 1
How can I plot a percentage histogram for this type of data? I know how to draw a normal histogram with matplotlib:
import matplotlib
matplotlib.use('PS')
import matplotlib.pyplot as plt
plt.hist()
but what values would I use to get a percentage histogram?
Sample columns from the dataframe:
key=[ 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 ]
turn=[ 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 1 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4]
class=[0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 0 1 1 0 0 1 1 0 1 0 1 1 0 1 1 0 0 1 0 1 0 0 0 0 0 1 1 1 1 0 1 1 0 0 1 0 0 0 0 0 1 0 1 1 0 0 1 1 1 0 0]
Upvotes: 0
Views: 806
Reputation: 339705
Since the concepts from the linked question are apparently not what you need, an alternative would be to produce pie charts as shown below.
key=[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2 ]
turn=[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4]
clas=[0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0]
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

df = pd.DataFrame({"key": key, "turn": turn, "class": clas})
# pivot_table aggregates with the mean by default; for a 0/1 column
# the mean is the fraction of rows with class 1
piv = pd.pivot_table(df, values="class", index="key", columns="turn")
print(piv)

# one pie per (key, turn) combination: rows are keys, columns are turns
fig, axes = plt.subplots(ncols=4, nrows=2)
for i in range(2):
    axes[i, 0].set_ylabel("key {}".format(i + 1))
    for j in range(4):
        pie = axes[i, j].pie([piv.values[i, j], 1. - piv.values[i, j]], autopct="%.1f%%")
        axes[i, j].set_aspect("equal")
        axes[0, j].set_title("turn {}".format(j + 1))

# single legend for the whole figure, built from the wedges of the last pie
plt.legend(pie[0], ["class 1", "class 0"], bbox_to_anchor=(1, 0.5), loc="right",
           bbox_transform=plt.gcf().transFigure)
plt.show()
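If you would rather have bars than pies, the same per-group means can be drawn as a grouped bar chart of percentages. This is only a sketch of one possible layout, not the one the question necessarily had in mind. The run-length encoding of `key` and `turn`, the variable name `pct`, and the `Agg` backend line are my own choices for a compact, headless example; the values are meant to be the same as in the lists above.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove to get an interactive window
import matplotlib.pyplot as plt
import pandas as pd

# Same data as above, with key and turn written run-length encoded for brevity
key = [1] * 37 + [2] * 29
turn = ([1] * 11 + [2] * 6 + [3] * 14 + [4] * 6      # turns for key 1
        + [1] * 1 + [2] * 6 + [3] * 11 + [4] * 11)   # turns for key 2
clas = [0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 0, 1,
        1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0,
        0, 1, 0, 1, 0, 0,
        0,
        0, 0, 1, 1, 1, 1,
        0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0,
        1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0]

df = pd.DataFrame({"key": key, "turn": turn, "class": clas})

# The mean of a 0/1 column is the fraction of 1s; scale it to a percentage.
# Rows of pct are keys, columns are turns.
pct = df.groupby(["key", "turn"])["class"].mean().unstack("turn") * 100

# Transpose so that turns are on the x axis and each key gets its own bar
ax = pct.T.plot(kind="bar", rot=0)
ax.set_xlabel("turn")
ax.set_ylabel("class 1 (%)")
ax.set_ylim(0, 100)
ax.legend(title="key")
fig = ax.get_figure()
# fig.savefig("class1_percent.png")  # hypothetical filename; or plt.show() interactively
```

Note that `plt.hist` is not involved here: a histogram counts raw values into bins, whereas the percentages are already aggregated per group, so a bar chart of the aggregated table is the closer fit.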
Upvotes: 1