agorapotatoes
agorapotatoes

Reputation: 351

Reference pandas index by name in multi-index dataset

I have a pandas dataframe that I have converted from an xarray dataset, which has three indices.

                                         XTIME  windspeed  direction  \
Time lat        lon                                                    
0    -36.457836 146.200153 2017-10-31 01:00:00   6.299611   0.853234   
                146.208487 2017-10-31 01:00:00   6.479416   0.794886   
                146.216820 2017-10-31 01:00:00   6.661458   0.736131   
                146.225153 2017-10-31 01:00:00   6.511693   0.702716   
                146.233487 2017-10-31 01:00:00   6.218874   0.725511   
...                                        ...        ...        ...   
463  -37.599502 147.958486 2017-11-19 08:00:00   2.083699  -1.161645   
                147.966819 2017-11-19 08:00:00   2.202758  -1.171001   
                147.975153 2017-11-19 08:00:00   2.367172  -1.163191   
                147.983486 2017-11-19 08:00:00   2.580872  -1.099003   
                147.991819 2017-11-19 08:00:00   2.817507  -1.016175   

I want to be able to reference the lat and lon indices and extract the numbers stored there. However, I don't know how to reference them by name.

A similar question was asked here but the accepted answer is simply for the OP to keep a copy of the column when indexing it, which does not help in my situation.

Upvotes: 1

Views: 64

Answers (1)

jezrael
jezrael

Reputation: 862511

I think simpliest is convert MultiIndex to columns by DataFrame.reset_index, after select get Series:

df = df.reset_index()
print (df)
   Time        lat         lon                XTIME  windspeed  direction
0     0 -36.457836  146.200153  2017-10-31 01:00:00   6.299611   0.853234
1     0 -36.457836  146.208487  2017-10-31 01:00:00   6.479416   0.794886
2     0 -36.457836  146.216820  2017-10-31 01:00:00   6.661458   0.736131
3     0 -36.457836  146.225153  2017-10-31 01:00:00   6.511693   0.702716
4     0 -36.457836  146.233487  2017-10-31 01:00:00   6.218874   0.725511
5   463 -37.599502  147.958486  2017-11-19 08:00:00   2.083699  -1.161645
6   463 -37.599502  147.966819  2017-11-19 08:00:00   2.202758  -1.171001
7   463 -37.599502  147.975153  2017-11-19 08:00:00   2.367172  -1.163191
8   463 -37.599502  147.983486  2017-11-19 08:00:00   2.580872  -1.099003
9   463 -37.599502  147.991819  2017-11-19 08:00:00   2.817507  -1.016175

time = df['Time']
lat = df['lat']
lon = df['lon']
print (lon)
0    146.200153
1    146.208487
2    146.216820
3    146.225153
4    146.233487
5    147.958486
6    147.966819
7    147.975153
8    147.983486
9    147.991819
Name: lon, dtype: float64

Or you can select each level in MultiIndex by Index.get_level_values - after select get indices:

time = df.index.get_level_values('Time')
lat = df.index.get_level_values('lat')
lon = df.index.get_level_values('lon')

print (lon)
Float64Index([        146.200153, 146.20848700000002, 146.21681999999998,
                      146.225153, 146.23348700000003, 147.95848600000002,
                      147.966819,         147.975153,         147.983486,
                      147.991819],
             dtype='float64', name='lon')

Upvotes: 1

Related Questions