Reputation: 2301
I have the following pandas DataFrame, but I am having trouble updating a column header value and easily accessing the header values (for example, to plot a given time (row) at the (lon, lat) location stored in the header).
df = pd.DataFrame(columns = ["id0", "id1", "id2"])
df.loc[2012]= [24, 25, 26]
df.loc[2013]= [28, 28, 29]
df.loc[2014]= [30, 31, 32]
df.columns = pd.MultiIndex.from_arrays([df.columns, [66,67,68], [110,111,112]],
                                       names=['id','lat','lon'])
Which then looks like this:
>>> df
id     id0   id1   id2
lat     66    67    68
lon    110   111   112
2012  24.0  25.0  26.0
2013  28.0  28.0  29.0
2014  30.0  31.0  32.0
I'd like to be able to adjust the latitude or longitude for df['id0'], or plot(df.ix[2014]) but at the (x,y) location based on (lon,lat).
Upvotes: 4
Views: 11388
Reputation: 11602
You can use df.columns.get_level_values('lat') to get that level's values as an Index object. This returns a copy of the index, so you cannot extend this approach to modify the coordinates in place.
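For the read-only side of the question (getting the coordinates out for plotting), a minimal sketch along these lines may help. It rebuilds the question's frame in a single constructor call, which is a convenience and not how the question built it, and it only suggests the plotting call in a comment so that matplotlib is not required.

```python
import pandas as pd

# Rebuild the example frame from the question in one call (a convenience).
df = pd.DataFrame(
    [[24, 25, 26], [28, 28, 29], [30, 31, 32]],
    index=[2012, 2013, 2014],
    columns=pd.MultiIndex.from_arrays(
        [["id0", "id1", "id2"], [66, 67, 68], [110, 111, 112]],
        names=["id", "lat", "lon"],
    ),
)

# Each call returns one label per column, in column order.
lon = df.columns.get_level_values("lon").to_numpy()
lat = df.columns.get_level_values("lat").to_numpy()

# Values for a single year, aligned element by element with lon and lat.
values_2014 = df.loc[2014].to_numpy()

# With matplotlib available, these arrays could feed a scatter plot, e.g.
# plt.scatter(lon, lat, c=values_2014)
for x, y, v in zip(lon, lat, values_2014):
    print(x, y, v)
```

Because all three arrays follow the column order, position i in each one refers to the same station.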
However, you can access the levels directly and modify them in place using this workaround, which relies on the private _levels attribute and may therefore break between pandas versions.
import pandas as pd
import numpy as np
df = pd.DataFrame(columns = ["id0", "id1", "id2"])
df.loc[2012]= [24, 25, 26]
df.loc[2013]= [28, 28, 29]
df.loc[2014]= [30, 31, 32]
df.columns = pd.MultiIndex.from_arrays([df.columns, [66,67,68], [110,111,112]],
                                       names=['id','lat','lon'])
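# One id label per column, in column order.
ids = df.columns.get_level_values('id')
id_ = 'id0'
# Position(s) of the column(s) whose id matches.
column_position = np.where(ids.values == id_)
new_lat = 90
new_lon = 0
# Caution: _levels is private and stores each level's unique values, so a column
# position is a valid index here only because each column has a distinct lat/lon
# listed in increasing order.
df.columns._levels[1].values[column_position] = new_lat
df.columns._levels[2].values[column_position] = new_lon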
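As an alternative that stays within the public API, the column index can be rebuilt instead of mutated. The sketch below is one possible way to do that, not the method used in this answer; set_coords is a hypothetical helper name, and it returns a new frame and leaves the original untouched.

```python
import pandas as pd

# Rebuild the example frame from the question in one call (a convenience).
df = pd.DataFrame(
    [[24, 25, 26], [28, 28, 29], [30, 31, 32]],
    index=[2012, 2013, 2014],
    columns=pd.MultiIndex.from_arrays(
        [["id0", "id1", "id2"], [66, 67, 68], [110, 111, 112]],
        names=["id", "lat", "lon"],
    ),
)


def set_coords(frame, id_, new_lat, new_lon):
    """Return a copy of `frame` whose `id_` column(s) carry new lat/lon labels.

    Hypothetical helper for illustration; it is not part of pandas.
    """
    # One row per column, with the level names as ordinary columns.
    cols = frame.columns.to_frame(index=False)
    mask = cols["id"] == id_
    cols.loc[mask, "lat"] = new_lat
    cols.loc[mask, "lon"] = new_lon

    out = frame.copy()
    # Level names are taken from the column names of `cols`.
    out.columns = pd.MultiIndex.from_frame(cols)
    return out


updated = set_coords(df, "id0", 90, 0)
print(updated.columns.tolist())
```

Rebuilding the index selects by label instead of by position within a level, so it does not depend on the lat/lon values being unique or sorted.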
Upvotes: 3
Reputation: 294218
You access a MultiIndex via tuples. For example:
df.loc[:, ('id0', 66, 110)]
However, you may want to access via lon/lat without specifying id or maybe you'll have multiple ids. In that case, you can do 2 things.
First, use pd.IndexSlice, which allows for useful MultiIndex slicing:
df.loc[:, pd.IndexSlice[:, 66, 110]]
Second, stack the id level into the rows and select on the remaining (lat, lon) columns:
df.stack(0).loc[:, (66, 110)].dropna().unstack()
Which is messier, but might be useful.
Finally, for the last thing you mentioned, selecting a specific row at a given lon/lat:
df.loc[2014, pd.IndexSlice[:, 66, 110]]
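The selections above can be tried together in a small runnable sketch. The frame is rebuilt in one constructor call for brevity, and the final df.xs call is an addition not mentioned in this answer: it selects by level name, which may read more clearly when the id is not known.

```python
import pandas as pd

# Rebuild the example frame from the question in one call (a convenience).
df = pd.DataFrame(
    [[24, 25, 26], [28, 28, 29], [30, 31, 32]],
    index=[2012, 2013, 2014],
    columns=pd.MultiIndex.from_arrays(
        [["id0", "id1", "id2"], [66, 67, 68], [110, 111, 112]],
        names=["id", "lat", "lon"],
    ),
)

idx = pd.IndexSlice

# Full tuple: a Series of all years for one column.
by_tuple = df.loc[:, ("id0", 66, 110)]

# Any id at lat 66, lon 110: a DataFrame that keeps the full column labels.
by_slice = df.loc[:, idx[:, 66, 110]]

# One year at that location: a Series indexed by the matching column labels.
row_2014 = df.loc[2014, idx[:, 66, 110]]

# Alternative by level name: the lat and lon levels are dropped, id remains.
by_name = df.xs((66, 110), axis=1, level=["lat", "lon"])

print(by_tuple)
print(by_slice)
print(row_2014)
print(by_name)
```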
Upvotes: 2