Reputation: 559
For example:
We have a pandas DataFrame foo with 2 columns, ['A', 'B'].
I want to call something like
foo.set_index([0,1])
instead of
foo.set_index(['A', 'B'])
I also tried foo.set_index([[0, 1]]), but that raised this error:
Length mismatch: Expected axis has 9 elements, new values have 2 elements
Upvotes: 11
Views: 23219
Reputation: 5292
This worked for me; the other answer didn't.
# single column
df.set_index(df.columns[1])
# multi column
df.set_index(df.columns[[1, 0]].tolist())
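To see what those two calls produce, here is a minimal sketch. The frame and its column names 'A', 'B', 'C' are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical sample frame; the column names are illustrative only.
df = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})

# Single column: df.columns[1] resolves position 1 to the label 'B'.
single = df.set_index(df.columns[1])

# Multiple columns: positions [1, 0] resolve to the labels ['B', 'A'],
# and .tolist() turns the resulting Index into a plain list of labels.
multi = df.set_index(df.columns[[1, 0]].tolist())

print(single.index.name)
print(list(multi.index.names))
print(list(multi.columns))
```

The order of the positions decides the order of the index levels, so [1, 0] puts 'B' before 'A'.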
Upvotes: 2
Reputation: 880877
If the column index is unique you could use:
df.set_index(list(df.columns[cols]))
where cols is a list of ordinal indices.
For example,
In [77]: np.random.seed(2016)
In [79]: df = pd.DataFrame(np.random.randint(10, size=(5,4)), columns=list('ABCD'))
In [80]: df
Out[80]:
   A  B  C  D
0  3  7  2  3
1  8  4  8  7
2  9  2  6  3
3  4  1  9  1
4  2  2  8  9
In [81]: df.set_index(list(df.columns[[0,2]]))
Out[81]:
     B  D
A C      
3 2  7  3
8 8  4  7
9 6  2  3
4 9  1  1
2 8  2  9
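As a rough illustration of why the integer attempts in the question fail, and how the positional lookup can be wrapped, here is a small sketch. The helper name set_index_by_position and the sample frame are hypothetical, and the helper assumes the column labels are unique.

```python
import pandas as pd

def set_index_by_position(df, positions):
    """Hypothetical helper: set the index from column positions.

    Assumes the column labels are unique.
    """
    labels = list(df.columns[positions])  # positions -> labels
    return df.set_index(labels)

# Illustrative frame with string column labels.
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]})

result = set_index_by_position(df, [0, 2])

# Plain integers are looked up as column *labels*, which do not exist here.
label_error = None
try:
    df.set_index([0, 2])
except KeyError as exc:
    label_error = exc

# A nested list is treated as an array of index *values*, so its length
# must match the number of rows (3 here, but the list has 2 items).
length_error = None
try:
    df.set_index([[0, 2]])
except ValueError as exc:
    length_error = exc

print(list(result.index.names))
print(type(label_error).__name__, type(length_error).__name__)
```

The second failure is the same kind of length mismatch reported in the question.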
If the DataFrame's column index is not unique, setting the index by label is impossible, and setting it by ordinal position is more complicated:
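import numpy as np
import pandas as pd
np.random.seed(2016)

def set_ordinal_index(df, cols):
    # Work on a copy so the caller's column labels are left untouched
    df = df.copy()
    # Temporarily relabel the columns 0..n-1 so that every label is unique
    columns, df.columns = df.columns, np.arange(len(df.columns))
    mask = df.columns.isin(cols)
    df = df.set_index(cols)
    # Restore the original labels on the remaining columns and the index levels
    df.columns = columns[~mask]
    df.index.names = columns[mask]
    return df

df = pd.DataFrame(np.random.randint(10, size=(5,4)), columns=list('AAAA'))
print(set_ordinal_index(df, [0,2]))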
yields
     A  A
A A      
3 2  7  3
8 8  4  7
9 6  2  3
4 9  1  1
2 8  2  9
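Another way to handle duplicate labels, sketched here as an alternative rather than as the approach above, is to select columns purely by position with iloc and build the index explicitly. The function name set_ordinal_index_iloc and the deterministic sample data are assumptions made for this example. Note that it always returns a MultiIndex, even for a single position.

```python
import numpy as np
import pandas as pd

def set_ordinal_index_iloc(df, cols):
    """Hypothetical alternative: build the index from positions via iloc."""
    chosen = set(cols)
    # Positions of the columns that stay in the frame.
    keep = [i for i in range(df.shape[1]) if i not in chosen]
    # One index level per requested position, named after its column label.
    index = pd.MultiIndex.from_arrays(
        [df.iloc[:, i].to_numpy() for i in cols],
        names=[df.columns[i] for i in cols],
    )
    out = df.iloc[:, keep].copy()
    out.index = index
    return out

# Deterministic illustrative data with fully duplicated column labels.
df = pd.DataFrame(np.arange(20).reshape(5, 4), columns=list("AAAA"))
out = set_ordinal_index_iloc(df, [0, 2])
print(out)
```

Because nothing is relabelled, the input frame is never modified, and the level order follows the order of cols.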
Upvotes: 16