Reputation: 1371
When I read in a CSV, I can say pd.read_csv('my.csv', index_col=3)
and it sets the fourth column as the index (index_col is zero-based).
How can I do the same with a pandas DataFrame that is already in memory? And how can I also use the first row as an index? The first column and the first row are strings; the rest of the matrix is integers.
Upvotes: 100
Views: 268955
Reputation: 1513
If the data comes from a CSV file, this works regardless of the number of rows; index_col=0 makes the first column the index:
df = pd.read_csv('data.csv', index_col=0)
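To see what this does without a file on disk, here is a minimal sketch that reads the same kind of data from an in-memory string. The CSV text and its column names are made up for illustration. With the default header handling, the first row becomes the column labels, and index_col=0 turns the first column into the row index.

```python
import io

import pandas as pd

# Hypothetical CSV content: first row and first column are string labels.
csv_text = "name,x,y\na,1,2\nb,3,4\n"

# index_col is zero-based, so 0 selects the first column as the index.
# The first row is used as the column labels by default.
df = pd.read_csv(io.StringIO(csv_text), index_col=0)
```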
Upvotes: 110
Reputation: 6861
Making the first (or n-th) column the index, in increasing order of verbosity (set_index returns a new DataFrame, so assign the result):
df.set_index(list(df)[0])
df.set_index(df.columns[0])
df.set_index(df.columns.tolist()[0])
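A minimal sketch of the column case on a small in-memory frame. The sample data and the names indexed and indexed_by_x are assumptions made up for this example.

```python
import pandas as pd

# Hypothetical sample data: a string label column followed by integer columns.
df = pd.DataFrame({"name": ["a", "b", "c"], "x": [1, 2, 3], "y": [4, 5, 6]})

# set_index returns a new DataFrame; df itself is left unchanged.
indexed = df.set_index(df.columns[0])

# The n-th column works the same way (here the second column, position 1).
indexed_by_x = df.set_index(df.columns[1])
```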
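Making the first (or n-th) row the index (this only works when the row holds as many values as the DataFrame has rows, i.e. the frame is square; otherwise pandas raises a length mismatch error):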
df.set_index(df.iloc[0].values)
You can use both if you want a multi-level index:
df.set_index([df.iloc[0], df.columns[0]])
Note that using a column as the index drops it from the columns by default (pass drop=False to keep it). Using a row's values as the index only copies them; the row itself stays in the DataFrame.
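If the goal is what the question describes (string labels in the first row and first column, integers everywhere else), one possible reading is to promote the first row to column labels and the first column to the row index. The following is only a sketch under that assumption; the frame raw and its contents are hypothetical.

```python
import pandas as pd

# Hypothetical frame whose first row and first column hold string labels.
raw = pd.DataFrame([["id", "x", "y"], ["a", 1, 2], ["b", 3, 4]])

# Promote the first row to column labels, then leave it out of the data.
df = raw.iloc[1:].copy()
df.columns = raw.iloc[0].tolist()

# Promote the first column to the row index; set_index drops it as a column.
df = df.set_index(df.columns[0])

# The remaining cells were stored as generic objects, so cast them to integers.
df = df.astype(int)
```

Unlike the set_index(df.iloc[0].values) form above, this does not require the frame to be square, because the row is used for the column axis rather than the row axis.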
Upvotes: 45