Reputation: 9762
In pandas DataFrames, the row index and the column index have very similar interfaces, and some operations can act along either rows or columns simply by passing an axis parameter (for example sort_index, and many more).
But how can I access (read and write) either row-index or column-index by specifying the axis?
# Instead of this
if axis == 0:
    table.index = some_function(table.index)
else:
    table.columns = some_function(table.columns)
# I would like to simply write:
newIndex = some_function(table.get_index_by_axis(axis))
table.set_index_by_axis(newIndex, axis=axis)
Does something like get_index_by_axis and set_index_by_axis exist?
Update:
DataFrames have an attribute axes that lets you select an axis by its number. However, it is effectively read-only: assigning a new value to one of its elements has no effect on the table.
index = table.axes[axis] # Read an index
newIndex = some_function(index)
table.axes[axis] = newIndex # This has no effect on table.
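A likely explanation for the behaviour in the update, sketched under the assumption that axes builds a new list on every access (this is how current pandas versions appear to implement it): the assignment changes only that temporary list, never the DataFrame. The names table, axes_first and axes_second below are illustrative only.

```python
import pandas as pd

table = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Each access to .axes returns a fresh list [table.index, table.columns]
axes_first = table.axes
axes_second = table.axes
print(axes_first is axes_second)  # the two lists are distinct objects

# Item assignment therefore only modifies a temporary list
table.axes[0] = pd.Index(['x', 'y', 'z'])
print(list(table.index))  # the row labels of the table are unchanged
```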
Upvotes: 1
Views: 703
Reputation: 9762
import pandas as pd

def apply_axis(df, axis, func):
    old_index = df.axes[axis]
    new_index = old_index.map(func)
    df = df.set_axis(new_index, axis=axis)
    return df

def some_function(x):
    return x + x

df = pd.DataFrame({'a': [1, 2, 3],
                   'b': [10, 20, 30],
                   'c': [100, 200, 300],
                   'd': [1000, 2000, 3000]})
#    a   b    c     d
# 0  1  10  100  1000
# 1  2  20  200  2000
# 2  3  30  300  3000

ret = apply_axis(df=df, axis=0, func=some_function)
#    a   b    c     d
# 0  1  10  100  1000
# 2  2  20  200  2000
# 4  3  30  300  3000

ret = apply_axis(df=df, axis=1, func=some_function)
#    aa  bb   cc    dd
# 0   1  10  100  1000
# 1   2  20  200  2000
# 2   3  30  300  3000
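When the function is applied label by label, as some_function is here, DataFrame.rename with a callable and an axis argument may serve the same purpose without a helper. This is only a sketch of an alternative, not part of the answer above; the names double_label, by_rows and by_cols are made up for the illustration.

```python
import pandas as pd

def double_label(label):
    # Same idea as some_function: works for both ints and strings
    return label + label

df = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30]})

# rename applies the callable to every label along the chosen axis
# and returns a new DataFrame; df itself is left unchanged
by_rows = df.rename(double_label, axis=0)
by_cols = df.rename(double_label, axis=1)

print(list(by_rows.index))
print(list(by_cols.columns))
```

Unlike apply_axis, this only covers functions that map one label to one label; a function that needs the whole index at once still requires set_axis.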
Upvotes: 0
Reputation: 1308
I looked into the pandas source code to see how the axis keyword is used. There's a method _get_axis_name that takes the axis as a parameter.
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
Pass in the axis parameter:
>>> df._get_axis_name(axis=0)
'index'
>>> df._get_axis_name(axis=1)
'columns'
You can use this with getattr or setattr.
>>> getattr(df, df._get_axis_name(axis=0))
RangeIndex(start=0, stop=3, step=1)
>>> getattr(df, df._get_axis_name(axis=1))
Index(['A', 'B'], dtype='object')
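The write direction can be sketched the same way with setattr. Note the assumptions: set_index_by_axis is a hypothetical helper named after the question, not a pandas function, and _get_axis_name is a private method, so it may change or disappear between pandas versions.

```python
import pandas as pd

def set_index_by_axis(df, new_index, axis=0):
    """Hypothetical helper: assign new_index to df.index or df.columns in place."""
    # _get_axis_name is private pandas API; it returns 'index' or 'columns'
    name = df._get_axis_name(axis)
    setattr(df, name, new_index)

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Replace the row labels, then lower-case the column labels
set_index_by_axis(df, ['x', 'y', 'z'], axis=0)
set_index_by_axis(df, df.axes[1].map(str.lower), axis=1)
print(df)
```

This modifies the DataFrame in place, whereas set_axis returns a new object; which one fits better depends on the calling code.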
Upvotes: 2