Reputation: 374
I'm trying to determine the most efficient method to swap values in a pandas df. Using the df below, where Item
is equal to Left
I want to swap cols A-D
and B-C
. The df is quite large though (> 1,000,000 rows) so I'm wondering if it would be more efficient to subset out all values == Left
and just swap the col headers and then merge back with original df.
So:
A1 > D1
B1 > C1
C1 > B1
D1 > A1
A2 > D2
B2 > C2
C2 > B2
D2 > A2
import pandas as pd
df = pd.DataFrame({
'Item' : ['L','L','R','R','R'],
'A1' : ['A','B','C','D','E'],
'B1' : ['A','B','C','D','E'],
'C1' : [1,2,3,4,5],
'D1' : [1,2,3,4,5],
'A2' : ['A','B','C','D','E'],
'B2' : ['A','B','C','D','E'],
'C2' : [1,2,3,4,5],
'D2' : [1,2,3,4,5],
})
def swap(idf, cswap, condition):
idx = idf[condition].index
for original,newcol in cswap.items():
ovalues = idf.loc[idx, original].values
idf.loc[idx, original] = idf.loc[idx, newcol].values
idf.loc[idx, newcol] = ovalues
return idf
df = swap(df, {'A1': 'D1', 'B1': 'C1', 'A2': 'D2', 'B2': 'C2'}, df['Item'] == 'L')
Out:
Item A1 B1 C1 D1 A2 B2 C2 D2
0 L 1 A 1 A A A 1 1
1 L 2 B 2 B B B 2 2
2 R C C 3 3 C C 3 3
3 R D D 4 4 D D 4 4
4 R E E 5 5 E E 5 5
Intended Output:
Item A1 B1 C1 D1 A2 B2 C2 D2
0 L 1 1 A A 1 1 A A
1 L 2 2 B B 2 2 B B
2 R C C 3 3 C C 3 3
3 R D D 4 4 D D 4 4
4 R E E 5 5 E E 5 5
Upvotes: 0
Views: 66
Reputation: 1244
You can use the swap
function defined below, for the problem.
Inputs:
cswap
is the dictionary where (key, value)
represents the columns you want to swap.
condition
is True for rows
where, we want to execute the swap. e.g., Item == 'L'
In [21]: df
Out[21]:
Item A1 B1 C1 D1 A2 B2 C2 D2
0 L 1 1 2 2 2 2 1 1
1 L 1 1 2 2 2 2 1 1
2 R 1 1 2 2 2 2 1 1
3 R 1 1 2 2 2 2 1 1
4 R 1 1 2 2 2 2 1 1
In [18]: def swap(idf, cswap, condition):
...: idx = idf[condition].index
...: for original,newcol in cswap.items():
...: ovalues = idf.loc[idx, original].values
...: idf.loc[idx, original] = idf.loc[idx, newcol].values
...: idf.loc[idx, newcol] = ovalues
...: return idf
In [22]: swap(df, {'A1': 'D1', 'B1': 'C1', 'A2': 'D2', 'B2': 'C2'}, df['Item'] == 'L')
Out[22]:
Item A1 B1 C1 D1 A2 B2 C2 D2
0 L 2 2 1 1 1 1 2 2
1 L 2 2 1 1 1 1 2 2
2 R 1 1 2 2 2 2 1 1
3 R 1 1 2 2 2 2 1 1
4 R 1 1 2 2 2 2 1 1
Upvotes: 1