sara
sara

Reputation: 45

How to merge two rows having same values into single row in python?

I am having a table called 'data' in that the values will be like following,

ID  NAME    DOB LOCATION
1   bob 08/10/1985  NEW JERSEY
1   bob 15/09/1987  NEW YORK
2   John    08/10/1985  NORTH CAROLINA
2   John    26/11/1990  OKLAHOMA

For example I want output like,

ID  NAME    No.of.Days                 
1   bob difference of two given dates in days   
2   John    difference of two given dates in days

Please help me to form a python code to get the expected output.

Upvotes: 2

Views: 259

Answers (2)

Rheatey Bash
Rheatey Bash

Reputation: 847

First, You need to convert Date column into date format. Lets suppose you are reading from .csv then read your .csv file as follows

df = pd.read_csv('yourfile.csv', parse_dates = ['DOB'])

otherwise, convert your existing dataframe column into date format as follows.

df['DOB'] = pd.to_datetime(df['DOB'])

now, you can perform the usual numeric operations.

df.groupby(['ID','NAME'])['DOB'].apply(lambda x: abs(pd.to_datetime(list(x)[0]) - pd.to_datetime(list(x)[1]))).reset_index(name='No.Of.Days')

Upvotes: 1

iamklaus
iamklaus

Reputation: 3770

If there will be only two dates in a for a given ID then below works!

df.groupby(['ID','NAME'])['DOB'].apply(lambda x: abs(pd.to_datetime(list(x)[0]) - pd.to_datetime(list(x)[1]))).reset_index(name='No.Of.Days')

Output

   ID   NAME  No.Of.Days
0    1    bob   766 days
1    2   John  1934 days

you can use np.diff also

df.groupby(['ID','NAME'])['DOB'].apply(lambda x: np.diff(list(x))[0]).reset_index(name='No.Of.Days')

Upvotes: 1

Related Questions