Reputation: 55
I have a pandas DataFrame (df1):
+----------+------------+------------+-----+
| City     | First_name | Last_name  | Age |
+----------+------------+------------+-----+
| London   | Han        | Solo       | 34  |
| Paris    | Luke       | Skywalker  | 30  |
| New York | Leia       | Organa     | 30  |
| LA       | Lando      | calrissian | 40  |
+----------+------------+------------+-----+
and a (pandas) Series obtained from a separate, smaller DataFrame (df2) using .loc[:, 'Age']:
+-----+
| Age |
+-----+
| 30  |
| 30  |
+-----+
I would like to select all of the rows in df1 whose Age appears in the series, giving something like this:
+----------+------------+------------+-----+
| City     | First_name | Last_name  | Age |
+----------+------------+------------+-----+
| Paris    | Luke       | Skywalker  | 30  |
| New York | Leia       | Organa     | 30  |
+----------+------------+------------+-----+
I have looked at the documentation for .loc and .iloc, but they don't seem to be what I am after. I was trying to write a small for loop, but I have limited experience (I'm new to programming). Does anyone have any advice?
Upvotes: 1
Views: 63
Reputation: 12018
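Comparing the two columns directly with == raises a ValueError when the Series have different lengths, so test for membership instead:
df1[df1['Age'].isin(df2['Age'])]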
Upvotes: 0
Reputation: 578
Assuming the larger df is df1 and the smaller one is df2, extract the unique Age values you want to select:
mask = df2['Age'].unique()
Then simply query df1 by this mask:
df1.loc[df1['Age'].isin(mask)]
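For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the frames from the question. The contents of df2 are an assumption (the question only shows its Age column), and the names ages and selected are just illustrative.

```python
import pandas as pd

# Larger frame, rebuilt from the table in the question
df1 = pd.DataFrame({
    "City": ["London", "Paris", "New York", "LA"],
    "First_name": ["Han", "Luke", "Leia", "Lando"],
    "Last_name": ["Solo", "Skywalker", "Organa", "calrissian"],
    "Age": [34, 30, 30, 40],
})

# Assumed stand-in for the smaller frame; only its Age column is known
df2 = pd.DataFrame({"Age": [30, 30]})
ages = df2.loc[:, "Age"]

# isin() builds a boolean Series aligned with df1: True where Age is one of the values in ages
selected = df1.loc[df1["Age"].isin(ages)]
print(selected)
```

isin() accepts the Series as-is, so calling unique() first is optional; duplicates in ages do not change the result. No for loop is needed because the membership test is applied to the whole column at once.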
Upvotes: 1