kjo
kjo

Reputation: 35311

How to swap two DataFrame columns?

In MATLAB, to swap the first and second columns of a table A, one would do this1

A = A(:, [2 1 3:end]);

Is there a similarly convenient way to do this if A were a pandas DataFrame instead?

1 MATLAB uses 1-based indexing.

Upvotes: 68

Views: 152015

Answers (16)

SLDem
SLDem

Reputation: 2182

Found this solution and it worked perfectly for me, swaps any columns WITH their data:

import pandas as pd


df = pd.DataFrame({
    'name': ['Alice', 'Bobby', 'Carl', 'Dan'],
    'age': [29, 30, 31, 32],
    'salary': [175.1, 180.2, 190.3, 205.4],
})

column_names = list(df)

print(column_names)  # 👉️ ['name', 'age', 'salary']

column_names[1], column_names[2] = column_names[2], column_names[1]

print('-' * 50)

print(column_names)  # 👉️ ['name', 'salary', 'age']

df = df.loc[:, column_names]

print('-' * 50)

#     name  salary  age
# 0  Alice   175.1   29
# 1  Bobby   180.2   30
# 2   Carl   190.3   31
# 3    Dan   205.4   32
print(df)

Upvotes: 0

Mykola Zotko
Mykola Zotko

Reputation: 17824

You can remove a column and then insert it again. For example, to swap columns 'B' and 'C':

df = pd.DataFrame({'A': [1, 2, 3], 'B': [10, 20, 30], 'C': [100, 200, 300]})

df.insert(loc=df.columns.get_loc('B'), column='C', value=df.pop('C'))

Result:

   A    C   B
0  1  100  10
1  2  200  20
2  3  300  30

Upvotes: 0

Jordy
Jordy

Reputation: 1960

Easily, You can try using this way:

columns_titles = ["D","C","B","A"] # new ordering of columns
df = df[column_titles]

Upvotes: 0

Marawan Mamdouh
Marawan Mamdouh

Reputation: 1015

I create swapColumns function

It swaps the position of the two columns in the DataFrame and then renames the columns to reflect the swap.

def swapColumns(df, col1, col2):
    # Get the list of column names in the DataFrame
    cols = df.columns
    
    # Get the index of the two columns to swap
    col1_idx = cols.get_loc(col1)
    col2_idx = cols.get_loc(col2)
    
    # Swap the two columns using the index
    df[[cols[col1_idx], cols[col2_idx]]] = df[[cols[col2_idx], cols[col1_idx]]]
    
    # Rename the columns to reflect the swap
    df.rename(columns = {col1:col2, col2:col1}, inplace = True)

To use the function, you can pass in the DataFrame df and the two column names col1 and col2 like this :

swapColumns(df, 'longitude', 'median_house_value')

This will swap the position of the longitude and median_house_value columns in the DataFrame df.

Upvotes: 0

mirekphd
mirekphd

Reputation: 6781

A "one-liner" automating this popular answer (upvoted) with columns list inversion (but not in-place, hence .reverse() wasn't used):

reverse_df = df.reindex(columns=list(df.columns)[::-1])

As a bonus a unit test verifying that it worked:

assert (reverse_df.columns[::-1] == df.columns).all()

Upvotes: 0

Deepak Bisht
Deepak Bisht

Reputation: 9

colList = list(df.columns)
colList[0], colList[1] =  colList[1], colList[0]
df = df[colList]

Upvotes: 1

Umair Nasir
Umair Nasir

Reputation: 43

This works for me in Python 3.x:

df = df.iloc[:, [1, 0] + list(range(2, df.shape[1]))]

Remember df = P.iloc[:, [1, 0] + range(2, P.shape[1])] wont work and will give the error:

TypeError: can only concatenate list (not "range") to list

Upvotes: 1

Srini
Srini

Reputation: 21

column swap

import pandas as pd

df = pd.read_csv('/Users/parent/Desktop/Col_swap.csv')

print(df)

columns_titles = ["A","B","C","E"]

df_reorder=df.reindex(columns=columns_titles)

df_reorder.to_csv('/Users/parent/Desktop/col_reorder1.csv', index=False)

print(df_reorder)

Output:

    B   A   C   E
0  c1  a1  b1  d1
1  c2  a2  b2  d2


    A   B   C   E
0  a1  c1  b1  d1
1  a2  c2  b2  d2

Upvotes: 1

aspire57
aspire57

Reputation: 1547

pandas has reindex method that does it. You just need to give a list with the column names in the order you wish:

columns_titles = ["B","A"]
df=df.reindex(columns=columns_titles)

Cheers

Upvotes: 104

nityanarayan
nityanarayan

Reputation: 378

For Dataframes in python, Considering that you have given the 2 columns, then:

#df is your data frame

col1='c1'
col2='c2'
df = df[[col1 if col == col2 else col2 if col == col1 else col for col in df.columns]]

Upvotes: 2

Yingjian Pan
Yingjian Pan

Reputation: 161

In my case, I have over 100 columns in my data frame. So instead list all columns, I wrote a short function to just switch two columns

def df_column_switch(df, column1, column2):
    i = list(df.columns)
    a, b = i.index(column1), i.index(column2)
    i[b], i[a] = i[a], i[b]
    df = df[i]
    return df

Upvotes: 16

AturSams
AturSams

Reputation: 7952

I would use:

end = df.shape[1] # or len(df.columns)
df.iloc[:, np.r_[1, 0, 2:end]

Upvotes: 1

Luca Mastrostefano
Luca Mastrostefano

Reputation: 3271

If you have multiple columns and performance and memory are not an issue, you can simply use this function:

def swap_columns(df, c1, c2):
    df['temp'] = df[c1]
    df[c1] = df[c2]
    df[c2] = df['temp']
    df.drop(columns=['temp'], inplace=True)

Upvotes: 3

EdChum
EdChum

Reputation: 394031

A slight variant on acushner's answer:

# get a list of the columns
col_list = list(df)
# use this handy way to swap the elements
col_list[0], col_list[1] = col_list[1], col_list[0]
# assign back, the order will now be swapped
df.columns = col_list

example:

In [39]:

df = pd.DataFrame({'a':randn(3), 'b':randn(3), 'c':randn(3)})
df
Out[39]:
          a         b         c
0 -0.682446 -0.200654 -1.609470
1 -1.998113  0.806378  1.252384
2 -0.250359  3.774708  1.100771
In [40]:

col_list = list(df)
col_list[0], col_list[1] = col_list[1], col_list[0]
df.columns = col_list
df
Out[40]:
          b         a         c
0 -0.682446 -0.200654 -1.609470
1 -1.998113  0.806378  1.252384
2 -0.250359  3.774708  1.100771

UPDATE

If you just want to change the column order without changing the column contents then you can reindex using fancy indexing:

In [34]:
cols = list(df)
cols[1], cols[0] = cols[0], cols[1]
cols

Out[34]:
['b', 'a', 'c']

In [35]:
df.ix[:,cols]

Out[35]:
          b         a         c
0 -0.200654 -0.682446 -1.609470
1  0.806378 -1.998113  1.252384
2  3.774708 -0.250359  1.100771

Upvotes: 35

acushner
acushner

Reputation: 9946

c = A.columns
A = A[c[np.r_[1, 0, 2:len(c)]]]

or, even easier:

A[[c[0], c[1]]] = A[[c[1], c[0]]]

*edit: fixed per Ivan's suggestions.

Upvotes: 8

kjo
kjo

Reputation: 35311

I finally settled for this:

A = A.iloc[:, [1, 0] + range(2, A.shape[1])]

It's far less convenient than the MATLAB version, but I like the fact that it does not require creating temporary variables.

Upvotes: 4

Related Questions