Reputation: 447
First dataframe d1: Index columns are State and City
**2008q2 2009q2 Price Ratio**
**State City**
Alabama Montevallo 119000.000000 125200.000000 0.950479
Tuscaloosa 139366.666667 136933.333333 1.017770
Alaska Fairbanks 254733.333333 225833.333333 1.127970
Second dataframe d2 :
State City
0 Alabama Auburn
1 Alabama Florence
2 Alabama Tuscaloosa
I need to filter out/remove rows in d1 where [State,City] is not in d2.
Upvotes: 1
Views: 69
Reputation: 447
Did the following steps :
extract the left_only entries from the outer join to get the rows_in_df1_not_in_df2.
**2008q2 2009q2 Price Ratio**
State City Alabama Montevallo 119000.000000 125200.000000 0.950479 Alaska Fairbanks 254733.333333 225833.333333 1.127970
Would like to know if there is any other way like boolean masking?
Upvotes: 0
Reputation: 863701
You can use join
with inner
join:
df = df2.join(df1, on=['State','City'], how='inner')
print (df)
State City 2008q2 2009q2 Price Ratio
2 Alabama Tuscaloosa 139366.666667 136933.333333 1.01777
Another solution with merge
, parameter on
can be omit if only intersection of same columns in both df
:
df = pd.merge(df1.reset_index(), df2)
#df = pd.merge(df1.reset_index(), df2, on=['State','City'])
print (df)
State City 2008q2 2009q2 Price Ratio
0 Alabama Tuscaloosa 139366.666667 136933.333333 1.01777
Upvotes: 1