sjkluend

Reputation: 9

Pull value from dataframe based on date

Suppose I have the following data frame below:

   userid   recorddate
0    tom    2018-06-12
1   nick    2019-06-01
2    tom    2018-02-12
3   nick    2019-06-02

How would I go about finding and pulling the earliest recorddate for each user, i.e. 2018-02-12 for tom and 2019-06-01 for nick?

In addition, what if I added a condition, such as the earliest recorddate that is later than 2019-01-01?

Upvotes: 0

Views: 43

Answers (2)

coco18

Reputation: 1085

Here is a solution with loc that handles the date filter:

import pandas as pd

df['recorddate'] = pd.to_datetime(df['recorddate'])
date = pd.to_datetime("2019-01-01")
# keep only the rows dated after the cutoff
df.loc[df['recorddate'] > date]

Output will be:

    userid  recorddate
1   nick    2019-06-01
3   nick    2019-06-02

You can replace the greater-than sign with an equals or less-than sign to get a different result. Cheers
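
To go from this filter to the earliest date per user, one option is to group the filtered rows and take the minimum. This is a minimal sketch, assuming the sample frame from the question; the names `cutoff`, `filtered` and `earliest_after` are only illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "userid": ["tom", "nick", "tom", "nick"],
    "recorddate": ["2018-06-12", "2019-06-01", "2018-02-12", "2019-06-02"],
})
df["recorddate"] = pd.to_datetime(df["recorddate"])

# Keep only rows after the cutoff, then take the minimum date per user
cutoff = pd.to_datetime("2019-01-01")
filtered = df.loc[df["recorddate"] > cutoff]
earliest_after = filtered.groupby("userid")["recorddate"].min()
print(earliest_after)
```

With this sample, tom has no rows after the cutoff, so he does not appear in the result at all.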

Upvotes: 1

Chris

Reputation: 16147

Everything will be easier if you convert your date strings into datetime objects. Once that's done, you can sort by date and take the first record per userid. You can also filter the dataframe by comparing against a date string in the conditional, then proceed the same way.

df['recorddate'] = pd.to_datetime(df['recorddate'])
df.sort_values(by='recorddate', inplace=True)
df.groupby('userid').first()

Output:

       recorddate
userid
nick   2019-06-01
tom    2018-02-12

Or, with the date filter applied (this relies on the sort above):

df[df['recorddate']>'2019-01-01'].groupby('userid').first()
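
If you would rather not depend on the rows being sorted first, a possible alternative is to ask for the minimum directly, or to use idxmin to keep the whole earliest row per user. This is a sketch under the assumption that the frame looks like the one in the question; `earliest` and `earliest_rows` are illustrative names.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "userid": ["tom", "nick", "tom", "nick"],
    "recorddate": ["2018-06-12", "2019-06-01", "2018-02-12", "2019-06-02"],
})
df["recorddate"] = pd.to_datetime(df["recorddate"])

# Earliest date per user, independent of row order
earliest = df.groupby("userid")["recorddate"].min()

# idxmin gives the index label of each user's earliest row,
# which lets loc pull the full rows (useful when there are more columns)
earliest_rows = df.loc[df.groupby("userid")["recorddate"].idxmin()]
print(earliest)
print(earliest_rows)
```

The idxmin variant keeps the original row index, so any other columns for that record come along with it.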

Upvotes: 0
