Reputation: 175
I have 2 dataframes (Thresholds and InfoTable) were the first row is a row of headers:
(Thresholds)
AA BB CC DD EE
0 15 7 0 23
and
(InfoTable)
ID Xposition Yposition AA BB CC DD EE
1 1 1 10 20 5 10 50
2 2 2 20 12 10 20 2
3 3 3 30 19 17 30 26
4 4 4 40 35 3 40 38
5 5 5 50 16 5 50 16
I am trying to filter the data so that the columns in the Thresholds dataframe that contain a 0 are columns that are removed from the InfoTable dataframe. I am then trying to compare the values of each row in the Thresholds dataframe to the values in the InfoTable dataframe so that They can be replaced with either a 1 or a 0 in the Infotable. My desired output is below:
ID Xposition Yposition BB CC EE
1 1 1 1 0 1
2 2 2 0 1 0
3 3 3 1 1 1
4 4 4 1 0 1
5 5 5 1 0 0
This is the code that I have right now to filter each table.
with open('thresholds_test.txt' ) as a:
Thresholds = pd.read_table(a, sep=',')
print Thresholds
with open('includedThresholds.txt') as b:
IncludedThresholds = pd.read_table(b, sep=',' )
print IncludedThresholds
InterestingThresholds = IncludedThresholds.drop(IncludedThresholds.columns[~IncludedThresholds.iloc[0].astype(bool)],axis=1)
print InterestingThresholds
with open('PivotTable.tab') as c:
PivotTable = pd.read_table(c, sep='\t' )
print PivotTable
headers = InterestingThresholds.columns.append(pd.Index(['ID','XPostion','YPosition']))
InfoTable = PivotTable.loc[:, headers]
print InfoTable
Any help would be greatly appreciated!
Upvotes: 1
Views: 147
Reputation: 153460
Find columns to keep and drop:
cols = Thresholds.columns[Thresholds.iloc[0].astype(bool)]
dcols = Thresholds.columns[~Thresholds.iloc[0].astype(bool)]
Do comparison:
comp_df = pd.DataFrame(InfoTable[cols].values >= Thresholds[cols].values, columns=cols).astype(int)
Assign comparison result to original dataframe and drop columns:
df_out = InfoTable.assign(**comp_df).drop(dcols, axis=1)
print(df_out)
Output:
ID Xposition Yposition BB CC EE
0 1 1 1 1 0 1
1 2 2 2 0 1 0
2 3 3 3 1 1 1
3 4 4 4 1 0 1
4 5 5 5 1 0 0
Upvotes: 1