Nyxynyx

Reputation: 63687

KeyError when accessing Pandas DataFrame using MultiIndex

A pandas.DataFrame has its MultiIndex created as shown:

import pandas as pd
import numpy as np
df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two three three two four three'.split(),
                   'C': np.arange(8), 'D': np.arange(8) * 2})

df = df.set_index(['A','B'])

This creates a `DataFrame`:

           C   D
A   B           
foo one    0   0
bar one    1   2
foo two    2   4
bar three  3   6
foo three  4   8
bar two    5  10
foo four   6  12
    three  7  14

Problem: Why does df['foo'] raise KeyError: 'foo'? The same happens with df['foo', 'one'] and df['foo']['one'].

Furthermore, the MultiIndex did not group all the foos together under A. Is it necessary to group them together, like this:

          A         B
one 1 -0.732470 -0.313871
    2 -0.031109 -2.068794
    3  1.520652  0.471764
two 1 -0.101713 -1.204458
    2  0.958008 -0.455419
    3 -0.191702 -0.915983

Upvotes: 0

Views: 4644

Answers (1)

DeepSpace

Reputation: 81684

df['foo'] tries to select a column named foo and raises a KeyError because there is no such column; foo is a label in the row index, not a column. I guess you meant df.loc['foo'] and df.loc['foo', 'one'], which select rows by index label.
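A minimal sketch of the difference, reusing the frame from the question. The variable names (has_foo_column, foo_rows, sorted_df, foo_one) are only illustrative, and the sort_index() step is my assumption about how one would group the foos together; it is not something the lookup with a single label needs.

```python
import numpy as np
import pandas as pd

# Same frame as in the question.
df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two three three two four three'.split(),
                   'C': np.arange(8), 'D': np.arange(8) * 2})
df = df.set_index(['A', 'B'])

# df[...] looks up column labels, and the only columns are C and D,
# so df['foo'] has nothing to find.
has_foo_column = 'foo' in df.columns

# .loc looks up row labels; a single label is matched against the
# first index level (A) and works even though the index is unsorted.
foo_rows = df.loc['foo']

# Sorting the index groups equal labels together. Assumption: this is
# optional here, but pandas may warn about performance when a full
# (A, B) key is looked up on an unsorted MultiIndex.
sorted_df = df.sort_index()
foo_one = sorted_df.loc[('foo', 'one')]  # tuple: one label per level
```

Printing foo_rows should show the five foo rows indexed by B, in their original order. Note that ('foo', 'three') appears twice in this index, so selecting that key returns more than one row.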

Upvotes: 3
