Sid
Sid

Reputation: 4055

How to select a level and all sub levels(and items) in pandas?

I have a multilevel dataframe:

year month  Open      High      Low       Close
2018 1      25565.75  27379.45  27652.05  25232.80
     2      27334.85  25107.40  27613.50  24781.70
     3      25022.35  24263.35  25226.05  23605.60
     4      24235.65  25531.60  25617.50  24062.95
     5      25607.05  26956.20  27164.55  25469.05
2019 1      27231.40  27720.40  27754.45  26923.75

I am trying to get the 2018 type (int64) selected and made into a separate dataframe.

I have tried:

df[2017]

I get a key error

df.loc[pd.IndexSlice[0,'Open']]

This gets me all Open prices.

df.loc[pd.IndexSlice['year':,'Open']]

df.loc[pd.IndexSlice['2017','Open']]

They don't seem to work.

How to access the levels? Keeping in mind they are int64 type?

Output expected:

     month  Open      High      Low       Close
     1      25565.75  27379.45  27652.05  25232.80
     2      27334.85  25107.40  27613.50  24781.70
     3      25022.35  24263.35  25226.05  23605.60
     4      24235.65  25531.60  25617.50  24062.95
     5      25607.05  26956.20  27164.55  25469.05

Also if there was another level how would I go about accessing two levels down data?

Thanks

Upvotes: 1

Views: 293

Answers (2)

It_is_Chris
It_is_Chris

Reputation: 14113

you can also use iloc with get_level_values:

2018_df = df.iloc[df.index.get_level_values('year') == 2018].reset_index(level=0, drop=True)

       Open        High       Low        Close
month               
1   25565.75    27379.45    27652.05    25232.80
2   27334.85    25107.40    27613.50    24781.70
3   25022.35    24263.35    25226.05    23605.60
4   24235.65    25531.60    25617.50    24062.95
5   25607.05    26956.20    27164.55    25469.05

or if you want to access another level:

month_1_df = df.iloc[df.index.get_level_values('month') == 1].reset_index(level=1, drop=True)

           Open        High         Low        Close
year                
2018    25565.75    27379.45    27652.05    25232.80
2019    27231.40    27720.40    27754.45    26923.75

or to access more than one level use transpose:

df.T.loc[:, (2018, 3)].to_frame().T

             Open         High        Low        Close
2018    3   25022.35    24263.35    25226.05    23605.6

now you can select any column(s) you want: df.T.loc[:, (2018, 3)].to_frame().T[['Open', 'Close']]

Upvotes: 1

jezrael
jezrael

Reputation: 863291

You want seelct index, so need loc:

df = df.loc[2018]

Or use DataFrame.xs:

df = df.xs(2018)

Or if want select all values in second level need : :

df = df.loc[pd.IndexSlice[2018,:]]
print (df)
           Open      High       Low     Close
month                                        
1      25565.75  27379.45  27652.05  25232.80
2      27334.85  25107.40  27613.50  24781.70
3      25022.35  24263.35  25226.05  23605.60
4      24235.65  25531.60  25617.50  24062.95
5      25607.05  26956.20  27164.55  25469.05

Upvotes: 2

Related Questions