Reputation: 1471
I know that there are ways to swap the column order in pandas. Let's say I have this example dataset:
import pandas as pd
employee = {'EmployeeID' : [0,1,2],
'FirstName' : ['a','b','c'],
'LastName' : ['a','b','c'],
'MiddleName' : ['a','b', None],
'Contact' : ['(M) 133-245-3123', '(F)[email protected]', '(F)312-533-2442 [email protected]']}
df = pd.DataFrame(employee)
One basic way to do this would be:
neworder = ['EmployeeID','FirstName','MiddleName','LastName','Contact']
df=df.reindex(columns=neworder)
However, as you can see, I only want to swap two columns. Listing every name was doable here only because there are just five columns, but what if I have something like 100 columns? What would be an effective way to swap or reorder columns?
There might be 2 cases:
Upvotes: 51
Views: 127043
Reputation: 21
Here's a two-line solution that works regardless of how many columns the dataframe has, as long as you know the names of the two columns you want to swap. If the two columns are "col1" and "col2" in your dataframe (df):
df['col1'], df['col2'] = df['col2'].values, df['col1'].values
df = df.rename(columns={'col1': 'col2', 'col2': 'col1'})
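The sketch below runs the same two-step idea (swap the values, then swap the labels) on a small made-up frame. The frame, the extra `other` column, and the explicit copies are assumptions added for illustration; the rename uses a direct two-way mapping, because `rename` applies the whole mapping at once rather than key by key.

```python
import pandas as pd

# Made-up frame for illustration; "other" sits between the two columns to swap.
df = pd.DataFrame({"col1": [1, 2, 3], "other": [0.1, 0.2, 0.3], "col2": ["x", "y", "z"]})

# Copy both columns first so neither assignment can observe the other's result.
first = df["col1"].to_numpy(copy=True)
second = df["col2"].to_numpy(copy=True)
df["col1"], df["col2"] = second, first

# The whole mapping is applied in one pass, so no temporary name is needed.
df = df.rename(columns={"col1": "col2", "col2": "col1"})

print(list(df.columns))
```

After this, the labels have traded positions and each label still points at its original data.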
Upvotes: 0
Reputation: 11
Positioning the pandas series according to need
#using pandas.iloc
df.iloc[:,[1,3,2,0]]
The first index passed to .iloc selects rows and the second selects columns, so here we keep every row (:) and pass a list of column positions in the order in which the columns should appear.
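As a small check of the positional approach, the sketch below applies `.iloc` to a frame shaped like the one in the question (the `Contact` values are placeholders, not the original data). Any position left out of the list is dropped from the result, so a five-column frame needs all five positions listed to keep every column.

```python
import pandas as pd

df = pd.DataFrame({
    "EmployeeID": [0, 1, 2],
    "FirstName": ["a", "b", "c"],
    "LastName": ["a", "b", "c"],
    "MiddleName": ["a", "b", None],
    "Contact": ["x", "y", "z"],  # placeholder values
})

# Rows: ':' keeps all of them. Columns: positions in the desired order.
partial = df.iloc[:, [1, 3, 2, 0]]      # position 4 (Contact) is omitted, so it is dropped
complete = df.iloc[:, [1, 3, 2, 0, 4]]  # all five positions listed, nothing is lost

print(list(partial.columns))
print(list(complete.columns))
```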
Upvotes: 0
Reputation: 101
I think a function like this will be very useful to have control over the position of the columns:
def df_changeorder(frame: pd.DataFrame, var: list, remove=False, count_order='left', offset=0) -> pd.DataFrame:
    """
    :param frame: dataframe
    :param var: list of columns to move
    :param remove: if True, keep only the columns listed in var
    :param count_order: where to start counting from left or right to insert
    :param offset: cols to skip in the count_order specified
    :return: dataframe with order changed
    """
    # columns that are not being moved, in their current order
    varlist = [w for w in frame.columns if w not in var]
    if remove:
        frame = frame[var]
    else:
        if offset == 0:
            if count_order == 'left':
                frame = frame[var + varlist]
            if count_order == 'right':
                frame = frame[varlist + var]
        else:
            if count_order == 'left':
                frame = frame[varlist[:offset] + var + varlist[offset:]]
            if count_order == 'right':
                frame = frame[varlist[:-offset] + var + varlist[-offset:]]
    return frame
A simple use case is to list the columns we want to move. For example, using the provided DataFrame, suppose we want this order:
['EmployeeID', 'Contact', 'LastName', 'FirstName', 'MiddleName']
Notice that we only need to move Contact and LastName, so we can get that result easily with:
# columns to swap
swap_columns = ["Contact","LastName"]
# change the order
df = df_changeorder(df, swap_columns, count_order='left', offset=1)
With this approach we can move as many columns as we want: we just specify the list of columns and apply the function as in the example.
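For comparison, here is a more compact sketch of the same idea that uses a single insertion index instead of `count_order` and `offset`. The name `move_columns` and its `position` parameter are hypothetical, not part of pandas or of the function above; `position` follows ordinary Python slicing, so a negative value counts from the right.

```python
import pandas as pd

def move_columns(frame: pd.DataFrame, cols: list, position: int = 0) -> pd.DataFrame:
    """Hypothetical helper: place `cols` at `position` among the remaining columns."""
    rest = [c for c in frame.columns if c not in cols]
    return frame[rest[:position] + list(cols) + rest[position:]]

df = pd.DataFrame({
    "EmployeeID": [0, 1, 2],
    "FirstName": ["a", "b", "c"],
    "LastName": ["a", "b", "c"],
    "MiddleName": ["a", "b", None],
    "Contact": ["x", "y", "z"],  # placeholder values
})

from_left = move_columns(df, ["Contact", "LastName"], position=1)
from_right = move_columns(df, ["Contact", "LastName"], position=-1)

print(list(from_left.columns))
print(list(from_right.columns))
```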
Upvotes: 0
Reputation: 4032
A concise way to reorder columns when you don't have too many columns and don't want to list the column names is with .iloc[].
df_reordered = df.iloc[:, [0, 1, 3, 2, 4]]
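When there are too many columns to type the positions by hand, the position list can be built programmatically. The sketch below assumes a hypothetical 100-column frame with columns named `c0` to `c99` and swaps two of them by looking up their positions with `Index.get_loc`.

```python
import pandas as pd

# Hypothetical wide frame: 100 columns named c0..c99, one row.
df = pd.DataFrame({f"c{i}": [i] for i in range(100)})

# Start from the identity order, then exchange the two positions of interest.
positions = list(range(df.shape[1]))
i, j = df.columns.get_loc("c10"), df.columns.get_loc("c42")
positions[i], positions[j] = positions[j], positions[i]

df_reordered = df.iloc[:, positions]

print(df_reordered.columns[10], df_reordered.columns[42])
```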
Upvotes: 3
Reputation: 31508
Columns can also be reordered when the dataframe is written out to a file (e.g. CSV):
df.to_csv('employees.csv',
columns=['EmployeeID','FirstName','MiddleName','LastName','Contact'])
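To see the effect without touching the disk, the sketch below calls `to_csv` without a path, which returns the CSV as a string. The `Contact` values are placeholders, and `index=False` is an addition here so that the header line contains only the column names.

```python
import pandas as pd

df = pd.DataFrame({
    "EmployeeID": [0, 1, 2],
    "FirstName": ["a", "b", "c"],
    "LastName": ["a", "b", "c"],
    "MiddleName": ["a", "b", None],
    "Contact": ["x", "y", "z"],  # placeholder values
})

# No path given, so the CSV text is returned instead of being written to a file.
csv_text = df.to_csv(
    columns=["EmployeeID", "FirstName", "MiddleName", "LastName", "Contact"],
    index=False,
)
header = csv_text.splitlines()[0]

print(header)
```

Only the written output is reordered; the column order of `df` itself stays as it was.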
Upvotes: 1
Reputation: 12930
If you want to have a fixed list of columns at the beginning, you could do something like
cols = ['EmployeeID','FirstName','MiddleName','LastName']
df = df[cols + [c for c in df.columns if c not in cols]]
This will put these 4 columns first and leave the rest untouched (without any duplicate column).
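The same pattern scales to wide frames. The sketch below assumes a hypothetical 100-column frame (columns `c0` to `c99`) and moves three arbitrarily chosen columns to the front while keeping the rest in their existing order.

```python
import pandas as pd

# Hypothetical wide frame: 100 columns named c0..c99, one row.
df = pd.DataFrame({f"c{i}": [i] for i in range(100)})

# Columns to pin at the front, in the order they should appear.
cols = ["c50", "c7", "c99"]
df = df[cols + [c for c in df.columns if c not in cols]]

print(list(df.columns[:5]))
```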
Upvotes: 20
Reputation: 131
When faced with same problem at larger scale, I came across a very elegant solution at this link: http://www.datasciencemadesimple.com/re-arrange-or-re-order-the-column-of-dataframe-in-pandas-python-2/ under the heading "Rearrange the column of dataframe by column position in pandas python".
Basically if you have the column order as a list, you can read that in as the new column order.
##### Rearrange the column of dataframe by column position in pandas python
df2=df1[df1.columns[[3,2,1,0]]]
print(df2)
In my case, I had a pre-calculated column linkage that determined the new order I wanted. If this order was defined as an array in L, then:
a_L_order = a[a.columns[L]]
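As an illustration of a computed order, the sketch below derives `L` with `numpy.argsort` so that the columns end up sorted by their mean value. The frame `a` and the choice of sorting by mean are assumptions for the example only; any integer array of column positions works the same way.

```python
import numpy as np
import pandas as pd

# Made-up numeric frame; column means are 3, 1, 4 and 2.
a = pd.DataFrame({"w": [3, 3], "x": [1, 1], "y": [4, 4], "z": [2, 2]})

# L holds column positions, ordered by ascending column mean.
L = np.argsort(a.mean().to_numpy())
a_L_order = a[a.columns[L]]

print(list(a_L_order.columns))
```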
Upvotes: 13
Reputation: 1325
Say your current column order is [b,c,d,a] and you want to reorder it to [a,b,c,d]. You could do it this way:
new_df = old_df[['a', 'b', 'c', 'd']]
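One practical difference from the `reindex(columns=...)` approach in the question is how a misspelled name is handled. The sketch below, on a made-up one-row frame, shows that bracket selection raises `KeyError` for an unknown column, whereas `reindex` quietly creates it filled with NaN.

```python
import pandas as pd

old_df = pd.DataFrame({"b": [1], "c": [2], "d": [3], "a": [4]})
new_df = old_df[["a", "b", "c", "d"]]

# "e" does not exist: reindex adds it as an all-NaN column...
filled = old_df.reindex(columns=["a", "b", "c", "e"])

# ...while bracket selection refuses with KeyError.
try:
    old_df[["a", "b", "c", "e"]]
    raised = False
except KeyError:
    raised = True

print(list(new_df.columns), raised)
```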
Upvotes: 78
Reputation: 9081
Two column Swapping
cols = list(df.columns)
a, b = cols.index('LastName'), cols.index('MiddleName')
cols[b], cols[a] = cols[a], cols[b]
df = df[cols]
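If the swap is needed in several places, the four lines above can be wrapped in a small function. The name `swap_columns` is hypothetical, and the `Contact` values in the sample frame are placeholders.

```python
import pandas as pd

def swap_columns(df: pd.DataFrame, first: str, second: str) -> pd.DataFrame:
    """Hypothetical helper: return a new frame with two named columns exchanged."""
    cols = list(df.columns)
    a, b = cols.index(first), cols.index(second)
    cols[a], cols[b] = cols[b], cols[a]
    return df[cols]

df = pd.DataFrame({
    "EmployeeID": [0, 1, 2],
    "FirstName": ["a", "b", "c"],
    "LastName": ["a", "b", "c"],
    "MiddleName": ["a", "b", None],
    "Contact": ["x", "y", "z"],  # placeholder values
})

swapped = swap_columns(df, "LastName", "MiddleName")

print(list(swapped.columns))
```

The cost depends only on the number of columns, not on how many names have to be typed, so it behaves the same for 5 or 100 columns.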
Reorder column Swapping (2 swaps)
cols = list(df.columns)
a, b, c, d = cols.index('LastName'), cols.index('MiddleName'), cols.index('Contact'), cols.index('EmployeeID')
cols[a], cols[b], cols[c], cols[d] = cols[b], cols[a], cols[d], cols[c]
df = df[cols]
Swapping Multiple
Now it comes down to how you can play with list slices:
cols = list(df.columns)
cols = cols[1::2] + cols[::2]
df = df[cols]
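To make the slicing concrete, the sketch below applies it to a frame shaped like the one in the question (placeholder `Contact` values) and adds a second, assumed example that rotates the last column to the front.

```python
import pandas as pd

df = pd.DataFrame({
    "EmployeeID": [0, 1, 2],
    "FirstName": ["a", "b", "c"],
    "LastName": ["a", "b", "c"],
    "MiddleName": ["a", "b", None],
    "Contact": ["x", "y", "z"],  # placeholder values
})

cols = list(df.columns)

# Odd positions first, then even positions.
interleaved = df[cols[1::2] + cols[::2]]

# Another slice-based reorder: move the last column to the front.
last_first = df[cols[-1:] + cols[:-1]]

print(list(interleaved.columns))
print(list(last_first.columns))
```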
Upvotes: 37