Reputation: 1507
I'm trying to figure out how to "name" the rows and columns in my pandas DataFrame, for clarity. I'm not sure what it's called, but I'm trying to create a table like this:
Is there an easy way to add "Actual class" on top of the column names, and "Predicted class" to the left of the row names, just for clarification?
Upvotes: 8
Views: 25833
Reputation: 59
You could indeed create a multi-index:
In [1]: import pandas as pd
In [2]: arrays = [['Cat','Dog','Rabbit']*3,
...: ['Cat']*3+['Dog']*3+['Rabbit']*3]
In [3]: tuples = list(zip(*arrays))
In [4]: index = pd.MultiIndex.from_tuples(tuples, names=['Predicted class', 'Actual class'])
In [5]: index
Out[5]:
MultiIndex(levels=[['Cat', 'Dog', 'Rabbit'], ['Cat', 'Dog', 'Rabbit']],
labels=[[0, 1, 2, 0, 1, 2, 0, 1, 2], [0, 0, 0, 1, 1, 1, 2, 2, 2]],
names=['Predicted class', 'Actual class'])
In [6]: numbers = [5,3,0,2,3,1,0,2,11]
In [7]: data = pd.Series(numbers, index=index)
In [8]: df = data.unstack('Actual class')  # unstack already returns a DataFrame
In [9]: df
Out[9]:
Actual class     Cat  Dog  Rabbit
Predicted class
Cat                5    2       0
Dog                3    3       2
Rabbit             0    1      11
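If the goal is only the layout shown in Out[9], a lighter route may be to name the two axes of an existing frame directly. This is a minimal sketch, assuming a plain frame holding the same numbers as above; `rename_axis` is a standard pandas method, while the variable names `df` and `named` are only for illustration.

```python
import pandas as pd

classes = ['Cat', 'Dog', 'Rabbit']
# Rows are predicted classes, columns are actual classes (same numbers as Out[9]).
df = pd.DataFrame([[5, 2, 0], [3, 3, 2], [0, 1, 11]],
                  index=classes, columns=classes)

# Name the row axis and the column axis without adding any index levels.
named = df.rename_axis(index='Predicted class', columns='Actual class')
print(named)
```

Both axes stay single-level here, so ordinary lookups such as `named.loc['Dog', 'Cat']` keep working unchanged.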
Upvotes: 0
Reputation: 294488
Start with a plain df:
classes = ['Cat', 'Dog', 'Rabbit']
df = pd.DataFrame([[5, 2, 0], [3, 3, 2], [0, 1, 11]], index=classes, columns=classes)
df
        Cat  Dog  Rabbit
Cat       5    2       0
Dog       3    3       2
Rabbit    0    1      11
pandas.concat
pd.concat(
    [pd.concat(
        [df],
        keys=['Actual Class'], axis=1)],
    keys=['Predicted Class']
)
                       Actual Class
                                Cat Dog Rabbit
Predicted Class Cat               5   2      0
                Dog               3   3      2
                Rabbit            0   1     11
pandas.MultiIndex.from_product
Reconstruct the DataFrame from its values, wrapping the existing index and columns in an outer level:
pd.DataFrame(
    df.values,
    pd.MultiIndex.from_product([['Predicted Class'], df.index]),
    pd.MultiIndex.from_product([['Actual Class'], df.columns])
)
                       Actual Class
                                Cat Dog Rabbit
Predicted Class Cat               5   2      0
                Dog               3   3      2
                Rabbit            0   1     11
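Going the other way can be useful once the labelled frame has been printed. The sketch below assumes the same `df` as above and uses `DataFrame.droplevel` to strip the outer level from each axis again; the names `labelled` and `plain` are only illustrative.

```python
import pandas as pd

classes = ['Cat', 'Dog', 'Rabbit']
df = pd.DataFrame([[5, 2, 0], [3, 3, 2], [0, 1, 11]],
                  index=classes, columns=classes)

# Same reconstruction as above, with the arguments spelled out as keywords.
labelled = pd.DataFrame(
    df.values,
    index=pd.MultiIndex.from_product([['Predicted Class'], df.index]),
    columns=pd.MultiIndex.from_product([['Actual Class'], df.columns]),
)

# Drop the outer (level 0) label on the rows, then on the columns.
plain = labelled.droplevel(0).droplevel(0, axis=1)
print(plain)
```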
Upvotes: 10
Reputation: 5006
You can also build it directly from a dict keyed by tuples (outer keys become the columns, inner keys the rows):
pd.DataFrame({
    ('Actual class', 'Cat'): {('Predicted class', 'Cat'): 5, ('Predicted class', 'Dog'): 3, ('Predicted class', 'Rabbit'): 0},
    ('Actual class', 'Dog'): {('Predicted class', 'Cat'): 2, ('Predicted class', 'Dog'): 3, ('Predicted class', 'Rabbit'): 1},
    ('Actual class', 'Rabbit'): {('Predicted class', 'Cat'): 0, ('Predicted class', 'Dog'): 2, ('Predicted class', 'Rabbit'): 11},
})
I'm not sure this is a good idea, though: you would be creating a MultiIndex only to make the printed DataFrame easier to read, and that complicates the rest of your code for little practical benefit.
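To make that cost concrete, here is a small sketch comparing a lookup on a plain frame with the same lookup once the outer labels are added. It assumes the table from the question (rows are predicted classes, columns are actual classes) and adds the labels with nested `pd.concat` calls, as in another answer; the names `plain` and `labelled` are only for illustration.

```python
import pandas as pd

classes = ['Cat', 'Dog', 'Rabbit']
plain = pd.DataFrame([[5, 2, 0], [3, 3, 2], [0, 1, 11]],
                     index=classes, columns=classes)

# Add an outer label to the columns, then to the rows.
labelled = pd.concat(
    [pd.concat([plain], keys=['Actual class'], axis=1)],
    keys=['Predicted class'],
)

# Plain frame: one label per axis.
a = plain.loc['Dog', 'Cat']
# Labelled frame: every lookup has to spell out the outer label as well.
b = labelled.loc[('Predicted class', 'Dog'), ('Actual class', 'Cat')]
print(a, b)
```

Both lookups return the same cell, but the second form has to be repeated everywhere the frame is indexed.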
Upvotes: 0