How to compare value of second column with same values of first column in pandas dataframe?

Question

How to extract and compare values of second column in a data frame for all the same values of first column in same data frame?

I have a data frame as 'df':

Name         Datetime
Bob          26-04-2018 12:00:00
Claire       26-04-2018 12:00:00
Bob          26-04-2018 12:30:00
Grace        27-04-2018 08:30:00
Bob          27-04-2018 09:30:00

I want to add a new column to the data frame as df['Id'] such that, for users having same names, if the datetime values have a difference of not more than 30 min, they would be assigned the same value of Id, and if the datetime difference is greater than 30 min, then it would be assigned a different id.

I think it could be achieved using iterating over loops but I am not sure how to do it. Also, is there a better way to do this as I have a huge data set?

My expected output of the data frame would be as:

Name         Datetime                 Id
Bob          26-04-2018 12:00:00      1
Claire       26-04-2018 12:00:00      2
Bob          26-04-2018 12:10:00      1
Bob          26-04-2018 12:20:00      1
Claire       27-04-2018 08:30:00      3
Bob          27-04-2018 09:30:00      4

Any help would be appreciated. Thanks

Andy L. · Accepted Answer

I think it is simple using groupby, grouper and ngroup as follows:

df['Id'] = df.groupby([pd.Grouper(freq='30T', key='Datetime'), 'Name']).ngroup().add(1)


Out[423]:
     Name            Datetime  Id
0     Bob 2018-04-26 12:00:00   1
1  Claire 2018-04-26 12:00:00   2
2     Bob 2018-04-26 12:10:00   1
3     Bob 2018-04-26 12:20:00   1
4  Claire 2018-04-27 08:30:00   3
5     Bob 2018-04-27 09:30:00   4

How to compare value of second column with same values of first column in pandas dataframe?

Answers (2)

Related Questions