ScienceDan

Reputation: 55

Pandas, selecting a subset of a dataframe using a series

I have a pandas DataFrame (df1):

+----------+------------+-------------+-----+--+
|   City   | First_name | Last_name   | Age |  |
+----------+------------+-------------+-----+--+
| London   | Han        | Solo        |  34 |  |
| Paris    | Luke       | Skywalker   |  30 |  |
| New York | Leia       | Organa      |  30 |  |
| LA       | Lando      | calrissian  |  40 |  |
+----------+------------+-------------+-----+--+

and a pandas Series obtained from a separate, smaller DataFrame (df2) using .loc[:, 'Age']:

+------------+
|    Age     |
+------------+
|    30      |
|    30      |
+------------+

I would like to select all of the rows in df1 whose Age matches a value in the series, giving something like this:

+----------+------------+-------------+-----+--+
|   City   | First_name | Last_name   | Age |  |
+----------+------------+-------------+-----+--+
| Paris    | Luke       | Skywalker   |  30 |  |
| New York | Leia       | Organa      |  30 |  |
+----------+------------+-------------+-----+--+

I have looked at the documentation for .loc and .iloc, but neither seems to be what I am after. I tried writing a small for loop, but I have limited experience (I'm new to programming). Does anyone have any advice?

Upvotes: 1

Views: 63

Answers (2)

Yaakov Bressler

Reputation: 12018

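Try comparing the Age columns directly. Note that this only works when both Series have identical index labels; if they differ (for example, a 4-row df1 against a 2-row df2), pandas raises a ValueError:

df1[df1['Age'] == df2['Age']]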
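
As a quick check of this approach, here is a minimal sketch that rebuilds the question's sample data and tries the direct comparison. The way df1 and df2 are constructed (including their default integer indexes) is my assumption, not something given in the question. With these two frames the Series are not identically labeled, so the comparison should raise instead of returning a boolean mask.

```python
import pandas as pd

# Sample data rebuilt from the question (construction is an assumption).
df1 = pd.DataFrame({
    "City": ["London", "Paris", "New York", "LA"],
    "First_name": ["Han", "Luke", "Leia", "Lando"],
    "Last_name": ["Solo", "Skywalker", "Organa", "calrissian"],
    "Age": [34, 30, 30, 40],
})
df2 = pd.DataFrame({"Age": [30, 30]})

try:
    # Element-wise == between Series requires matching index labels.
    direct = df1[df1["Age"] == df2["Age"]]
    error_message = None
except ValueError as exc:
    direct = None
    error_message = str(exc)

print(error_message)
```

So the direct comparison is really only an option when both Series are known to line up row for row; for "is this value anywhere in the other column" a membership test is the better fit.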

Upvotes: 0

Shaunak Sen

Reputation: 578

Assuming the larger DataFrame is df1 and the smaller one is df2, first extract the unique Age values you want to select:

ages = df2['Age'].unique()

Then filter df1 to the rows whose Age is one of those values:

df1.loc[df1['Age'].isin(ages)]
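
To see the whole thing end to end, here is a small self-contained sketch using the sample data from the question. The DataFrame construction and the variable names `ages` and `selected` are my own, for illustration only.

```python
import pandas as pd

# Sample data rebuilt from the question (construction is an assumption).
df1 = pd.DataFrame({
    "City": ["London", "Paris", "New York", "LA"],
    "First_name": ["Han", "Luke", "Leia", "Lando"],
    "Last_name": ["Solo", "Skywalker", "Organa", "calrissian"],
    "Age": [34, 30, 30, 40],
})
df2 = pd.DataFrame({"Age": [30, 30]})

# Distinct ages to look for; here that is just the single value 30.
ages = df2["Age"].unique()

# isin() builds a boolean mask: True where df1's Age is one of `ages`.
selected = df1.loc[df1["Age"].isin(ages)]

print(selected)
```

This should keep the Paris and New York rows with their original index labels. The unique() call is optional, since isin() also accepts the Series itself (df1['Age'].isin(df2['Age'])); it just avoids passing duplicate values.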

Upvotes: 1
