mdicrist
mdicrist

Reputation: 175

Replacing values in a row in a dataframe based on values from another dataframe (Python)

I have 2 dataframes (Thresholds and InfoTable) were the first row is a row of headers:

(Thresholds)

AA BB CC DD EE 
0  15 7  0  23

and

(InfoTable)

ID Xposition Yposition AA BB CC DD EE
1  1         1         10 20 5  10 50
2  2         2         20 12 10 20 2
3  3         3         30 19 17 30 26
4  4         4         40 35 3  40 38
5  5         5         50 16 5  50 16

I am trying to filter the data so that the columns in the Thresholds dataframe that contain a 0 are columns that are removed from the InfoTable dataframe. I am then trying to compare the values of each row in the Thresholds dataframe to the values in the InfoTable dataframe so that They can be replaced with either a 1 or a 0 in the Infotable. My desired output is below:

ID Xposition Yposition BB CC EE
1  1         1         1  0  1
2  2         2         0  1  0
3  3         3         1  1  1
4  4         4         1  0  1
5  5         5         1  0  0

This is the code that I have right now to filter each table.

with open('thresholds_test.txt' ) as a:
    Thresholds = pd.read_table(a, sep=',') 
print Thresholds 

with open('includedThresholds.txt') as b:
    IncludedThresholds = pd.read_table(b, sep=',' )
print IncludedThresholds

InterestingThresholds = IncludedThresholds.drop(IncludedThresholds.columns[~IncludedThresholds.iloc[0].astype(bool)],axis=1)
print InterestingThresholds 

with open('PivotTable.tab') as c:
    PivotTable = pd.read_table(c, sep='\t' )
print PivotTable

headers = InterestingThresholds.columns.append(pd.Index(['ID','XPostion','YPosition']))
InfoTable = PivotTable.loc[:, headers]
print InfoTable 

Any help would be greatly appreciated!

Upvotes: 1

Views: 147

Answers (1)

Scott Boston
Scott Boston

Reputation: 153460

Find columns to keep and drop:

cols = Thresholds.columns[Thresholds.iloc[0].astype(bool)]
dcols = Thresholds.columns[~Thresholds.iloc[0].astype(bool)]

Do comparison:

comp_df = pd.DataFrame(InfoTable[cols].values >= Thresholds[cols].values, columns=cols).astype(int)

Assign comparison result to original dataframe and drop columns:

df_out = InfoTable.assign(**comp_df).drop(dcols, axis=1)
print(df_out)

Output:

  ID  Xposition  Yposition  BB  CC  EE
0   1          1          1   1   0   1
1   2          2          2   0   1   0
2   3          3          3   1   1   1
3   4          4          4   1   0   1
4   5          5          5   1   0   0

Upvotes: 1

Related Questions