Joylove

Reputation: 414

pandas.core.indexing.IndexingError: Too many indexers with NaN in multi index

Consider the following dataframe

import pandas as pd

my_df = pd.DataFrame()
my_df.at[0,'tunnel1']=3
my_df.at[1,'tunnel1']=3
my_df.at[1,'tunnel2']=2
my_df.at[2,'tunnel1']=3
my_df.at[2,'tunnel2']=2
my_df.at[3,'tunnel1']=4
my_df.at[3,'tunnel2']=1
my_df.at[3,'tunnel3']=4
my_df.at[4,'tunnel1']=1
my_df.at[4,'tunnel2']=5
my_df.at[4,'tunnel3']=1
my_df.at[5,'tunnel1']=1
my_df.at[5,'tunnel2']=5
my_df.at[5,'tunnel3']=1
my_df.at[5,'tunnel4']=3
my_df.at[6,'tunnel1']=6
my_df.at[6,'tunnel2']=5
my_df.at[6,'tunnel3']=5
my_df.at[6,'tunnel4']=2
my_df['data1']='ham'
my_df['data2']='eggs'
my_df['data3']='coffee'

The DataFrame looks like this:

   tunnel1  tunnel2  tunnel3  tunnel4 data1 data2   data3
0      3.0      NaN      NaN      NaN   ham  eggs  coffee
1      3.0      2.0      NaN      NaN   ham  eggs  coffee
2      3.0      2.0      NaN      NaN   ham  eggs  coffee
3      4.0      1.0      4.0      NaN   ham  eggs  coffee
4      1.0      5.0      1.0      NaN   ham  eggs  coffee
5      1.0      5.0      1.0      3.0   ham  eggs  coffee
6      6.0      5.0      5.0      2.0   ham  eggs  coffee

Then I set a MultiIndex:

my_df = my_df.set_index(['tunnel1', 'tunnel2', 'tunnel3', 'tunnel4'])

It now looks like this:

                               data1 data2   data3
tunnel1 tunnel2 tunnel3 tunnel4                    
3.0     NaN     NaN     NaN       ham  eggs  coffee
        2.0     NaN     NaN       ham  eggs  coffee
                        NaN       ham  eggs  coffee
4.0     1.0     4.0     NaN       ham  eggs  coffee
1.0     5.0     1.0     NaN       ham  eggs  coffee
                        3.0       ham  eggs  coffee
6.0     5.0     5.0     2.0       ham  eggs  coffee

Now I want to slice it so that I get the rows for each unique entry of the MultiIndex:

for configuration in my_df.index.unique():
    mini_df = my_df.loc[configuration]

pandas.core.indexing.IndexingError: Too many indexers

The first index value in the loop is

configuration
(3.0, nan, nan, nan)

I believe this is what causes the error.

What I want from my loop is

mini_df

   tunnel1  tunnel2  tunnel3  tunnel4 data1 data2   data3
0      3.0      NaN      NaN      NaN   ham  eggs  coffee

mini_df'

   tunnel1  tunnel2  tunnel3  tunnel4 data1 data2   data3
1      3.0      2.0      NaN      NaN   ham  eggs  coffee
2      3.0      2.0      NaN      NaN   ham  eggs  coffee

mini_df''

   tunnel1  tunnel2  tunnel3  tunnel4 data1 data2   data3
3      4.0      1.0      4.0      NaN   ham  eggs  coffee

mini_df'''

   tunnel1  tunnel2  tunnel3  tunnel4 data1 data2   data3
4      1.0      5.0      1.0      NaN   ham  eggs  coffee

Any suggestions on what to try here please? Thanks for your help in advance.

Upvotes: 0

Views: 243

Answers (2)

ansev

Reputation: 30920

Use DataFrame.xs + Index.get_level_values:

for id1 in my_df.index.get_level_values(0).unique():
    print(my_df.xs(id1))

You could save the dataframes in a dict:

df_id1={id1:my_df.xs(id1) for id1 in my_df.index.get_level_values(0).unique()}
for key in df_id1:
    print(f'df_id1[{key}]')
    print('-'*50)
    print(df_id1[key])

df_id1[3.0]
--------------------------------------------------
                        data1 data2   data3
tunnel2 tunnel3 tunnel4                    
NaN     NaN     NaN       ham  eggs  coffee
2.0     NaN     NaN       ham  eggs  coffee
                NaN       ham  eggs  coffee
df_id1[4.0]
--------------------------------------------------
                        data1 data2   data3
tunnel2 tunnel3 tunnel4                    
1.0     4.0     NaN       ham  eggs  coffee
df_id1[1.0]
--------------------------------------------------
                        data1 data2   data3
tunnel2 tunnel3 tunnel4                    
5.0     1.0     NaN       ham  eggs  coffee
                3.0       ham  eggs  coffee
df_id1[6.0]
--------------------------------------------------
                        data1 data2   data3
tunnel2 tunnel3 tunnel4                    
5.0     5.0     2.0       ham  eggs  coffee

We can also use DataFrame.groupby:

for i, group in my_df.groupby(level=0):
# for i, group in my_df.groupby('tunnel1'):  # also works in recent versions of pandas
    print(group)

                                data1 data2   data3
tunnel1 tunnel2 tunnel3 tunnel4                    
1.0     5.0     1.0     NaN       ham  eggs  coffee
                        3.0       ham  eggs  coffee
                                data1 data2   data3
tunnel1 tunnel2 tunnel3 tunnel4                    
3.0     NaN     NaN     NaN       ham  eggs  coffee
        2.0     NaN     NaN       ham  eggs  coffee
                        NaN       ham  eggs  coffee
                                data1 data2   data3
tunnel1 tunnel2 tunnel3 tunnel4                    
4.0     1.0     4.0     NaN       ham  eggs  coffee
                                data1 data2   data3
tunnel1 tunnel2 tunnel3 tunnel4                    
6.0     5.0     5.0     2.0       ham  eggs  coffee
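
The groupby above splits on tunnel1 only. If one group per unique combination of all four levels is wanted (as in the question's expected output), a possible sketch is to move the index back into columns and group on them with dropna=False. This assumes pandas 1.1 or later, where groupby accepts dropna; the names cols, flat and mini_dfs are only illustrative, and the frame is rebuilt here from the question's data so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

cols = ['tunnel1', 'tunnel2', 'tunnel3', 'tunnel4']

# Rebuild the question's data (same values, built in one call).
my_df = pd.DataFrame({
    'tunnel1': [3.0, 3.0, 3.0, 4.0, 1.0, 1.0, 6.0],
    'tunnel2': [np.nan, 2.0, 2.0, 1.0, 5.0, 5.0, 5.0],
    'tunnel3': [np.nan, np.nan, np.nan, 4.0, 1.0, 1.0, 5.0],
    'tunnel4': [np.nan, np.nan, np.nan, np.nan, np.nan, 3.0, 2.0],
    'data1': 'ham',
    'data2': 'eggs',
    'data3': 'coffee',
}).set_index(cols)

# Move the index levels back into ordinary columns, then group on all of them.
# dropna=False keeps rows whose keys contain NaN (assumes pandas >= 1.1).
flat = my_df.reset_index()
mini_dfs = [group for _, group in flat.groupby(cols, dropna=False, sort=False)]

for mini_df in mini_dfs:
    print(mini_df, end='\n\n')
```

Each element of mini_dfs keeps the tunnel columns and the original integer row labels, so rows 1 and 2 end up together in one frame.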

Upvotes: 1

BENY

Reputation: 323306

Why not replace the NaN values with the string 'NaN' (using replace or fillna) before setting the index?

my_df = my_df.fillna('NaN').set_index(['tunnel1', 'tunnel2', 'tunnel3', 'tunnel4'])
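
A rough sketch of how the loop from the question could look once the NaN values are gone. Note that it swaps the string 'NaN' for a numeric placeholder so the index levels stay float; SENTINEL = -1.0 is an assumption (it must not occur as a real tunnel value), and raw, filled and mini_dfs are illustrative names.

```python
import numpy as np
import pandas as pd

cols = ['tunnel1', 'tunnel2', 'tunnel3', 'tunnel4']
SENTINEL = -1.0  # assumption: never a genuine tunnel value

# Same data as in the question, built in one call.
raw = pd.DataFrame({
    'tunnel1': [3.0, 3.0, 3.0, 4.0, 1.0, 1.0, 6.0],
    'tunnel2': [np.nan, 2.0, 2.0, 1.0, 5.0, 5.0, 5.0],
    'tunnel3': [np.nan, np.nan, np.nan, 4.0, 1.0, 1.0, 5.0],
    'tunnel4': [np.nan, np.nan, np.nan, np.nan, np.nan, 3.0, 2.0],
    'data1': 'ham',
    'data2': 'eggs',
    'data3': 'coffee',
})

# Fill NaN only in the key columns, then build and sort the MultiIndex.
filled = raw.fillna({c: SENTINEL for c in cols}).set_index(cols).sort_index()

# .loc with a one-element list of tuples returns a DataFrame,
# even when only a single row matches the key.
mini_dfs = {key: filled.loc[[key]] for key in filled.index.unique()}

for key, mini_df in mini_dfs.items():
    print(key)
    print(mini_df, end='\n\n')
```

Passing the tuple inside a list keeps every mini_df a DataFrame; with a bare tuple, a key that matches a single row would come back as a Series instead.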

Upvotes: 2
