Reputation: 337
I am working with a dataframe like this:
import pandas as pd
records = [
    {'Name': 'John', 'Start': '2020-01-01', 'Stop': '2020-03-31'},
    {'Name': 'John', 'Start': '2020-04-01', 'Stop': '2020-12-31'},
    {'Name': 'Mary', 'Start': '2020-01-01', 'Stop': '2020-03-15'},
    {'Name': 'Mary', 'Start': '2020-03-16', 'Stop': '2020-03-31'},
    {'Name': 'Mary', 'Start': '2020-04-01', 'Stop': '2020-12-31'},
    {'Name': 'Stan', 'Start': '2020-02-01', 'Stop': '2020-03-31'},
    {'Name': 'Stan', 'Start': '2020-04-01', 'Stop': '2020-12-31'},
]
df = pd.DataFrame(records)
df['Start'] = pd.to_datetime(df['Start'])
df['Stop'] = pd.to_datetime(df['Stop'])
df
which gives the output:
   Name      Start       Stop
0  John 2020-01-01 2020-03-31
1  John 2020-04-01 2020-12-31
2  Mary 2020-01-01 2020-03-15
3  Mary 2020-03-16 2020-03-31
4  Mary 2020-04-01 2020-12-31
5  Stan 2020-02-01 2020-03-31
6  Stan 2020-04-01 2020-12-31
I want to select every record for each individual who has at least one record with a start date of 2020-01-01. In other words, if someone has no record beginning on 2020-01-01, I don't want any of their records. The result should look like this:
   Name      Start       Stop
0  John 2020-01-01 2020-03-31
1  John 2020-04-01 2020-12-31
2  Mary 2020-01-01 2020-03-15
3  Mary 2020-03-16 2020-03-31
4  Mary 2020-04-01 2020-12-31
There should be no records for Stan in the output, because none of his entries start on 2020-01-01. Any ideas on how to accomplish this? Thanks!
Upvotes: 1
Views: 252
Reputation: 75080
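Build the boolean condition, group it by Name, and use transform('any') so that every row of a group is marked True if any row in that group starts on 2020-01-01: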
df[df['Start'].eq("2020-01-01").groupby(df["Name"]).transform('any')]
   Name      Start       Stop
0  John 2020-01-01 2020-03-31
1  John 2020-04-01 2020-12-31
2  Mary 2020-01-01 2020-03-15
3  Mary 2020-03-16 2020-03-31
4  Mary 2020-04-01 2020-12-31
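If the one-liner is hard to read, here is the same idea broken into steps, followed by an isin-based variant that should give the same rows. This is only a sketch built on the sample data from the question; the names starts_on_target, keep, names, result and result_isin are mine, not part of pandas.

```python
import pandas as pd

# Sample data from the question.
records = [
    {'Name': 'John', 'Start': '2020-01-01', 'Stop': '2020-03-31'},
    {'Name': 'John', 'Start': '2020-04-01', 'Stop': '2020-12-31'},
    {'Name': 'Mary', 'Start': '2020-01-01', 'Stop': '2020-03-15'},
    {'Name': 'Mary', 'Start': '2020-03-16', 'Stop': '2020-03-31'},
    {'Name': 'Mary', 'Start': '2020-04-01', 'Stop': '2020-12-31'},
    {'Name': 'Stan', 'Start': '2020-02-01', 'Stop': '2020-03-31'},
    {'Name': 'Stan', 'Start': '2020-04-01', 'Stop': '2020-12-31'},
]
df = pd.DataFrame(records)
df['Start'] = pd.to_datetime(df['Start'])
df['Stop'] = pd.to_datetime(df['Stop'])

# Step 1: row-level flag, True where the record itself starts on the target date.
starts_on_target = df['Start'].eq(pd.Timestamp('2020-01-01'))

# Step 2: group the flags by Name and broadcast "any True in the group"
# back onto every row, so the mask has the same length as df.
keep = starts_on_target.groupby(df['Name']).transform('any')

# Step 3: filter with the group-level mask.
result = df[keep]

# Variant: collect the qualifying names first, then filter with isin.
names = df.loc[starts_on_target, 'Name'].unique()
result_isin = df[df['Name'].isin(names)]

print(result)
```

Both versions keep the original index and row order. The transform version stays a single boolean mask, which is handy if you want to combine it with other conditions; the isin version may be easier to debug because you can inspect names directly.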
Upvotes: 1