lumenwrites

Reputation: 1507

How do I "name" columns/rows in pandas DataFrame for clarity?

I'm trying to figure out how to "name" the rows and columns in my pandas DataFrame, for clarity. I'm not sure what it's called, but I'm trying to create a table like this: [image]

Is there an easy way to add "Actual class" on top of the column names, and "Predicted class" to the left of the row names, just for clarification?

Upvotes: 8

Views: 25833

Answers (3)

hendriksc1

Reputation: 59

You could indeed create a multi-index:

In [1]: import pandas as pd
   ...: import numpy as np


In [2]: arrays = [['Cat','Dog','Rabbit']*3,
   ...:          ['Cat']*3+['Dog']*3+['Rabbit']*3]     

In [3]: tuples = list(zip(*arrays))

In [4]: index = pd.MultiIndex.from_tuples(tuples, names=['Predicted class', 'Actual class'])

In [5]: index
Out[5]: 
MultiIndex(levels=[['Cat', 'Dog', 'Rabbit'], ['Cat', 'Dog', 'Rabbit']],
           labels=[[0, 1, 2, 0, 1, 2, 0, 1, 2], [0, 0, 0, 1, 1, 1, 2, 2, 2]],
           names=['Predicted class', 'Actual class'])

In [6]: numbers = [5,3,0,2,3,1,0,2,11]

In [7]: data = pd.Series(numbers, index=index)

In [8]: df = pd.DataFrame(data.unstack('Actual class'))

In [9]: df
Out[9]: 
Actual class     Cat  Dog  Rabbit
Predicted class                  
Cat                5    2       0
Dog                3    3       2
Rabbit             0    1      11
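
For reference, the session above can be condensed into a short script. This is only a sketch: it swaps the hand-built arrays for pd.MultiIndex.from_product, so the level order is reversed relative to In [4], and the names `classes` and `table` are introduced here for illustration.

```python
import pandas as pd

classes = ['Cat', 'Dog', 'Rabbit']
numbers = [5, 3, 0, 2, 3, 1, 0, 2, 11]

# from_product varies the last level fastest, so the first level is the
# actual class and the second is the predicted class, which matches the
# order of `numbers` in the session above.
index = pd.MultiIndex.from_product(
    [classes, classes], names=['Actual class', 'Predicted class']
)

# Pivot the 'Actual class' level into columns; unstack already returns a
# DataFrame, so no extra pd.DataFrame(...) wrapper is needed.
table = pd.Series(numbers, index=index).unstack('Actual class')
print(table)
```

Recent pandas versions print a MultiIndex differently from Out[5] (the `labels` argument was renamed to `codes` in pandas 0.24), but the resulting table is the same.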

Upvotes: 0

piRSquared

Reputation: 294488

Start with df

import pandas as pd

classes = ['Cat', 'Dog', 'Rabbit']
df = pd.DataFrame([[5, 2, 0], [3, 3, 2], [0, 1, 11]], classes, classes)
df

        Cat  Dog  Rabbit
Cat       5    2       0
Dog       3    3       2
Rabbit    0    1      11

pandas.concat

pd.concat(
    [pd.concat(
        [df],
        keys=['Actual Class'], axis=1)],
    keys=['Predicted Class']
)

                       Actual Class           
                                Cat Dog Rabbit
Predicted Class Cat               5   2      0
                Dog               3   3      2
                Rabbit            0   1     11

pandas.MultiIndex.from_product

Reconstruct

pd.DataFrame(
    df.values,
    pd.MultiIndex.from_product([['Predicted Class'], df.index]),
    pd.MultiIndex.from_product([['Actual Class'], df.columns])
)

                       Actual Class           
                                Cat Dog Rabbit
Predicted Class Cat               5   2      0
                Dog               3   3      2
                Rabbit            0   1     11
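
Both versions change how cells are addressed, which is worth seeing before adopting them. The sketch below is an illustration, not part of the original answer: it rebuilds the wrapped frame with the pd.concat approach, looks up one cell, and then strips the outer levels again. The names `wrapped`, `cell`, and `plain` are made up for this example.

```python
import pandas as pd

classes = ['Cat', 'Dog', 'Rabbit']
df = pd.DataFrame([[5, 2, 0], [3, 3, 2], [0, 1, 11]], classes, classes)

# Same wrapping as the pd.concat version above.
wrapped = pd.concat(
    [pd.concat([df], keys=['Actual Class'], axis=1)],
    keys=['Predicted Class']
)

# Every lookup now needs the outer label as well as the class name.
cell = wrapped.loc[('Predicted Class', 'Cat'), ('Actual Class', 'Dog')]

# Dropping the outer level on both axes recovers the plain frame.
plain = wrapped.droplevel(0).droplevel(0, axis=1)
```

Here `cell` is the predicted-Cat / actual-Dog entry, and `plain` has the same labels and values as the original df.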

Upvotes: 10

Corentin Limier

Reputation: 5006

import pandas as pd

pd.DataFrame({
('Actual class', 'Cat'): {('Predicted class', 'Cat'): 5, ('Predicted class', 'Dog'): 2, ('Predicted class', 'Rabbit'): 0},
('Actual class', 'Dog'): {('Predicted class', 'Cat'): 3, ('Predicted class', 'Dog'): 3, ('Predicted class', 'Rabbit'): 2},
('Actual class', 'Rabbit'): {('Predicted class', 'Cat'): 0, ('Predicted class', 'Dog'): 1, ('Predicted class', 'Rabbit'): 11},
    })


I'm not sure this is a good idea, though: you would be creating a MultiIndex just to clarify how the DataFrame is displayed as a string. That complicates the rest of the code for little practical benefit.
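
If the goal is only a clearer printout, a lighter option is to name the two axes rather than add index levels. The following is a minimal sketch of that alternative, assuming axis names are enough for your purposes; it reuses the data from the other answer, and `labeled` is a name chosen here.

```python
import pandas as pd

classes = ['Cat', 'Dog', 'Rabbit']
df = pd.DataFrame([[5, 2, 0], [3, 3, 2], [0, 1, 11]], classes, classes)

# Name the axes themselves. The row and column labels stay plain strings,
# so lookups such as labeled.loc['Cat', 'Dog'] work exactly as before.
labeled = df.rename_axis(index='Predicted class', columns='Actual class')
print(labeled)
```

The axis names appear in the header of the printed frame, and the index and columns remain single-level.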

Upvotes: 0
