Reputation: 35
I am trying to select the columns of a DataFrame whose correlation with another column exceeds a threshold in absolute value, like this:
df.loc[:, (df.corr()['col'] <= -0.05) | (df.corr()['col'] >= 0.05)]
But I get the following error:
IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match).
I get the same error if I try to select the columns with variance > 1:
df.loc[:, df.var() > 1]
Why am I getting this indexing error? I want to drop the columns whose correlation with another column is between -0.05 and 0.05.
Can someone help me resolve this? I am not sure where I am going wrong.
Upvotes: 2
Views: 5893
Reputation: 211
I think I found your problem.
First I built my own test set, but everything worked fine:
df = pd.DataFrame({
"col": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
"A": [1.1, 1.0, 1.0, 1.0, 1.0, 1.1],
"B": [1.0, 2.1, 3.0, 3.9, 5.0, 6.0]
})
df.loc[:, (df.corr()['col'] <= -0.05) | (df.corr()['col'] >= 0.05)]
I got:
   col    B
0  1.0  1.0
1  2.0  2.1
2  3.0  3.0
3  4.0  3.9
4  5.0  5.0
5  6.0  6.0
But then, after rereading your error, I thought that maybe your data contains columns that the corr() method silently ignores, such as columns with an object dtype.
If I build a new test set with a text column, I get the same error as you:
df = pd.DataFrame({
"col": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
"A": [1.1, 1.0, 1.0, 1.0, 1.0, 1.1],
"B": [1.0, 2.1, 3.0, 3.9, 5.0, 6.0],
"C": ["A", "B", "C", "D", "E", "F"]
})
df.corr()['col'] >= 0.05
df.loc[:, (df.corr()['col'] <= -0.05) | (df.corr()['col'] >= 0.05)]
Then I got:
pandas.core.indexing.IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match).
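To make the mismatch visible, here is a small sketch that compares the index of the boolean mask with the DataFrame's columns. It selects the numeric columns explicitly with select_dtypes, which is my own choice so the snippet does not rely on how a given pandas version treats non-numeric columns in corr() (newer versions may raise an error instead of dropping them). The names corr_with_col, mask and missing are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "col": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "A": [1.1, 1.0, 1.0, 1.0, 1.0, 1.1],
    "B": [1.0, 2.1, 3.0, 3.9, 5.0, 6.0],
    "C": ["A", "B", "C", "D", "E", "F"],
})

# Correlation of every numeric column with "col"; the text column "C"
# is not part of the result.
corr_with_col = df.select_dtypes(include="number").corr()["col"]
mask = (corr_with_col <= -0.05) | (corr_with_col >= 0.05)

# Columns of df that have no entry in the mask: this is why .loc
# cannot align the mask with df.columns.
missing = df.columns.difference(mask.index)

print(list(mask.index))
print(list(df.columns))
print(list(missing))
```

The mask has one entry fewer than the DataFrame has columns, so df.loc[:, mask] has no value to use for "C".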
One way to fix this is to drop the weakly correlated columns by label instead of indexing with a boolean mask:
df = df.drop(columns=df.corr().query("-0.05 < col < 0.05").index)
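If you would rather keep the boolean-mask style from the question, another possible approach (a sketch, not the only way) is to reindex the mask to df.columns so that every column has an entry. Here fill_value=False means non-numeric columns are dropped; that is an assumption on my part, and fill_value=True would keep them instead. The same pattern is shown for the variance filter. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "col": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "A": [1.1, 1.0, 1.0, 1.0, 1.0, 1.1],
    "B": [1.0, 2.1, 3.0, 3.9, 5.0, 6.0],
    "C": ["A", "B", "C", "D", "E", "F"],
})

# Work on the numeric columns only, independent of pandas version defaults.
numeric = df.select_dtypes(include="number")

# Correlation filter: keep columns with |corr| >= 0.05 against "col".
corr_with_col = numeric.corr()["col"]
corr_mask = (corr_with_col <= -0.05) | (corr_with_col >= 0.05)

# Align the mask with all columns of df; columns without a correlation
# value (here "C") are treated as not selected.
corr_mask = corr_mask.reindex(df.columns, fill_value=False)
by_corr = df.loc[:, corr_mask]

# Variance filter, aligned the same way.
var_mask = (numeric.var() > 1).reindex(df.columns, fill_value=False)
by_var = df.loc[:, var_mask]

print(list(by_corr.columns))
print(list(by_var.columns))
```

Because the mask now has exactly one entry per column of df, .loc can align it and the IndexingError goes away.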
Note: you will get quicker and more relevant answers if you provide a complete, reproducible sample of the failing code ;)
Upvotes: 3