Moondra

Reputation: 4511

Boolean indexing using loc resulting in an error

I have two dataframes for the same stock, but one has more data and different prices. I want to compare one of the columns to see where they differ. (Below are smaller versions of the dataframes.)

df

Date          Open    Close
2007-03-22    3.65      1.0
2007-03-23    3.87      1.0
2007-03-26    3.83      1.0
2007-03-27    3.61      1.0
2007-03-28    4.65      1.0

df2

Date          Open   Close
2007-03-22    3.15    1.0
2007-03-23    3.87    0.0
2007-03-26    3.33    0.0
2007-03-27    3.61    0.0
2007-03-28    4.65    0.0

Since one of the dataframes has more dates, I'm trying to slice it using loc and then use boolean indexing to find out where they differ.

I tried something like this:

df.Open[df.loc['2010-01-04':, 'Open'] != df2.loc['2010-01-04':, 'Open']]

I only want to compare the "Open" columns of both dataframes, and only over a slice of dates. I want the output to be just the df.Open column (and index) where the two 'Open' columns differ.

But I'm getting this error:

pandas.core.indexing.IndexingError: Unalignable boolean Series key provided

Upvotes: 3

Views: 2059

Answers (1)

akuiper

Reputation: 214957

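When you use Boolean indexing, the Boolean Series must align with the object being subsetted: its index has to cover every label of that object. Here the mask only covers the sliced dates while df.Open covers all of them, so slice df.Open the same way first: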

df.Open.loc['2010-01-04':][df.loc['2010-01-04':, 'Open'] != df2.loc['2010-01-04':, 'Open']]
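
A minimal sketch of the same fix on date-indexed data, slicing each column once and reusing the slices. The values are taken from the question's small tables, the start date is picked arbitrarily, and it assumes both frames have a sorted DatetimeIndex with the same dates inside the slice:

```python
import pandas as pd

# Sample data modeled on the question's tables (assumed DatetimeIndex).
dates = pd.to_datetime(
    ["2007-03-22", "2007-03-23", "2007-03-26", "2007-03-27", "2007-03-28"]
)
df = pd.DataFrame({"Open": [3.65, 3.87, 3.83, 3.61, 4.65]}, index=dates)
df2 = pd.DataFrame({"Open": [3.15, 3.87, 3.33, 3.61, 4.65]}, index=dates)

start = "2007-03-23"  # arbitrary start date for this sketch

# Slice both columns the same way so the mask and the data share an index.
left = df.loc[start:, "Open"]
right = df2.loc[start:, "Open"]

# Keep only the rows of df's Open column that differ from df2's.
differing = left[left != right]
print(differing)
```

If the two frames do not share the same dates within the slice, the comparison itself can fail before the indexing step, so the indexes would need to be aligned first.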

The error can be reproduced with this example:

import pandas as pd
df = pd.DataFrame({"A": [1, 2, 3, 4]})
df.A[df.loc[2:, 'A'] == df.loc[2:, 'A']]

IndexingError: Unalignable boolean Series key provided

But this works fine:

df.A.loc[2:][df.loc[2:, 'A'] == df.loc[2:, 'A']]

#2    3
#3    4
#Name: A, dtype: int64
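
What pandas checks here is index labels: the mask is looked up against the index of the object being subsetted, and any label missing from the mask triggers the error. A small sketch of that, plus an alternative that extends the mask instead of slicing the data. The reindex approach is my own suggestion rather than part of the fix above, and the names s, mask and full_mask are just for illustration:

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4], name="A")

# The mask only has labels 2 and 3; s has labels 0 through 3.
mask = s.loc[2:] > 3

# Indexing the full Series with the partial mask fails to align.
error_name = None
try:
    s[mask]
except Exception as exc:  # IndexingError; its import path varies by pandas version
    error_name = type(exc).__name__

# Alternative: extend the mask to the full index, treating missing labels as False.
full_mask = mask.reindex(s.index, fill_value=False).astype(bool)
result = s[full_mask]
print(error_name)
print(result)
```

Extending the mask keeps the original object untouched, which can be convenient when the same mask is reused against several columns of the full frame.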

Upvotes: 4
