Reputation: 175
I am trying to filter a dataframe based on the columns I have previously obtained from filtering the dataframe below.
AA BB CC DD EE FF GG
0 1 1 0 1 0 0
The dataframe is coming from a file where the data in each row is either a 0 or a 1 and will change based on the file that is loaded in. I have used the following code to filter this dataframe so that my output consists of only columns with a value of 1 in them.
with open('Factors.txt') as b:
IncludedFactors = pd.read_table(b, sep=',' )
print IncludedFactors
InterestingFactors = IncludedFactors.drop(IncludedFactors.columns[~IncludedFactors.iloc[0].astype(bool)],axis=1)
print InterestingFactors
output:
BB CC EE
1 1 1
I then need to filter out a bigger dataframe that has many headers, however I only need the ID, Xposition, Yposition and the headers of the InterestingFactors dataframe.
Below is the code I have tried, however the output still only consists of 3 headers instead of the 6 I need.
headers = InterestingFactors.columns.values
print headers
PivotTable = InfoTable.filter(items=['ID', 'Postion_X','Position_Y','headers'])
print PivotTable
Any help on how to do this correctly is greatly appreciated!
Upvotes: 0
Views: 95
Reputation: 7038
Here's one way you can do this:
headers = InterestingFactors.columns.append(pd.Index(['ID','Postion_X','Position_Y']))
PivotTable = InfoTable.loc[:, headers]
This combines the columns you're looking for from InterestingFactors
with the 3 columns you mention above. This Index
is passed to .loc[]
.
This also works:
headers = InterestingFactors.columns
PivotTable = InfoTable.loc[:, (pd.Index(['ID','Postion_X','Position_Y']) | headers)]
The datatypes for comparison (I believe) must be the same. Converting your list of 3 standard columns to pd.Index
will allow you to use |
within .loc[]
.
Upvotes: 1