Reputation: 29
I want to subtract the 3rd from the right column from all other columns except for itself and the last column, then create new columns with these values with the header including the original column's header. The column names and the number of columns between each file vary, but the names of the 2 most right columns are always the same.
Example df:
ROI005 | ROI008 | 53141 | AVG | ERR |
---|---|---|---|---|
2 | 5 | 1 | 4 | 1 |
4 | 2 | 2 | 3 | 3 |
3 | 3 | 1 | 5 | 2 |
Desired output:
ROI005 | ROI008 | 53141 | AVG | ERR | ROI005 - Background | ROI008 - Background | Average - Background |
---|---|---|---|---|---|---|---|
2 | 5 | 1 | 4 | 1 | 1 | 4 | 3 |
4 | 2 | 2 | 3 | 3 | 2 | 0 | 1 |
3 | 3 | 1 | 5 | 2 | 2 | 2 | 4 |
I've been using this to get the difference for the 3rd to last column using the below code. I know that I can use df.iloc[ : , : -3]
for all but the last 3 columns, but I'm not sure how to combine this with df.iloc[ : , -2]
for multiple columns.
import pandas as pd
import glob
files = glob.glob(r'C:\Users\Me\1Test' + '/*.txt')
for f in files:
df = pd.read_table(f)
df['Average - Background'] = df.iloc[ : , -2] - df.iloc[ : , -3]
df.to_excel(f.replace('txt', 'xlsx'), 'Sheet1')
Data:
{'ROI005': [2, 4, 3],
'ROI008': [5, 2, 3],
'53141': [1, 2, 1],
'AVG': [4, 3, 5],
'ERR': [1, 3, 2]}
Upvotes: 0
Views: 459
Reputation:
We could subtract that particular column from the other columns horizontally using sub
on axis=0
; then join
it back to df
:
out = (df.join(df.drop(df.columns[[-3,-1]], axis=1)
.sub(df[df.columns[-3]], axis=0)
.add_suffix(' - Background')))
Output:
ROI005 ROI008 53141 AVG ERR ROI005 - Background ROI008 - Background AVG - Background
0 2 5 1 4 1 1 4 3
1 4 2 2 3 3 2 0 1
2 3 3 1 5 2 2 2 4
Upvotes: 2