Reputation: 4511
I have two dataframes for the same stock, but one has more data and some different prices. I want to compare one of the columns to see where they differ. (Below is a smaller version of each dataframe.)
df
Date Open Close
2007-03-22 3.65 1.0
2007-03-23 3.87 1.0
2007-03-26 3.83 1.0
2007-03-27 3.61 1.0
2007-03-28 4.65 1.0
df2
Date Open Close
2007-03-22 3.15 1.0
2007-03-23 3.87 0.0
2007-03-26 3.33 0.0
2007-03-27 3.61 0.0
2007-03-28 4.65 0.0
Since one of the dataframes has more dates, I'm trying to slice both using loc and then use boolean indexing to find out where they differ.
I tried something like this:
df.Open[df.loc['2010-01-04':, 'Open'] != df2.loc['2010-01-04':, 'Open']]
I only want to compare the "Open" columns of both dataframes, and only over a slice of dates. The output should be just the df.Open column (with its index) at the rows where the two 'Open' columns differ.
However, I'm getting this error:
pandas.core.indexing.IndexingError: Unalignable boolean Series key provided
Upvotes: 3
Views: 2059
Reputation: 214957
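When you use Boolean indexing, the Boolean Series has to align with the object being subsetted: every index label of that object must also appear in the mask's index. Here df.Open covers all dates while the mask only covers the slice, so pandas cannot align them. Apply the same slice to df.Open before indexing: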
df.Open.loc['2010-01-04':][df.loc['2010-01-04':, 'Open'] != df2.loc['2010-01-04':, 'Open']]
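As a minimal sketch of that fix on the sample data from the question: the start date '2007-03-23' is an assumption (the sample only covers March 2007, so it stands in for '2010-01-04'), and the variable names are illustrative.

```python
import pandas as pd

dates = pd.DatetimeIndex(
    ["2007-03-22", "2007-03-23", "2007-03-26", "2007-03-27", "2007-03-28"],
    name="Date",
)
df = pd.DataFrame(
    {"Open": [3.65, 3.87, 3.83, 3.61, 4.65], "Close": [1.0] * 5},
    index=dates,
)
df2 = pd.DataFrame(
    {"Open": [3.15, 3.87, 3.33, 3.61, 4.65], "Close": [1.0, 0.0, 0.0, 0.0, 0.0]},
    index=dates,
)

start = "2007-03-23"  # assumed stand-in for the '2010-01-04' in the question

# Slice once, so the mask and the Series being filtered share the same index.
open1 = df.loc[start:, "Open"]
open2 = df2.loc[start:, "Open"]

mask = open1 != open2      # True where the two 'Open' values differ
differing = open1[mask]    # df's 'Open' values (with dates) at those rows
print(differing)
```

Naming the sliced Series first avoids repeating the slice three times and makes it obvious that the mask and the data come from the same rows.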
The error can be reproduced with this example:
import pandas as pd
df = pd.DataFrame({"A": [1,2,3,4]})
df.A[df.loc[2:, 'A'] == df.loc[2:, 'A']]
IndexingError: Unalignable boolean Series key provided
But this works fine:
df.A.loc[2:][df.loc[2:, 'A'] == df.loc[2:, 'A']]
#2 3
#3 4
#Name: A, dtype: int64
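One more thing to watch, since the question mentions that one dataframe has more dates: as far as I know, != between two Series raises a ValueError ("Can only compare identically-labeled Series objects") when their labels are not identical, so both slices need to cover the same dates. A hedged sketch with made-up data and illustrative names, restricting both Series to their common dates first:

```python
import pandas as pd

# Hypothetical data: long_open covers more dates than short_open.
long_open = pd.Series(
    [3.65, 3.87, 3.83, 3.61, 4.65],
    index=pd.DatetimeIndex(
        ["2007-03-22", "2007-03-23", "2007-03-26", "2007-03-27", "2007-03-28"]
    ),
    name="Open",
)
short_open = pd.Series(
    [3.87, 3.33, 3.61],
    index=pd.DatetimeIndex(["2007-03-23", "2007-03-26", "2007-03-27"]),
    name="Open",
)

# Keep only the dates that appear in both Series.
common = long_open.index.intersection(short_open.index)
a = long_open.loc[common]
b = short_open.loc[common]

# Both sides now carry identical labels, so the comparison and the
# Boolean indexing align without errors.
differing = a[a != b]
print(differing)
```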
Upvotes: 4