pandas add column to dataframe having the value from another row based on condition

Question

I have a dataframe with columns named 'id', 'time', 'x', and 'y':

id	time	x	y
1	0	14	12
1	1	32	23
1	2	52	14
2	2	12	34
3	0	62	17
3	1	82	35
3	2	22	25

I want to add two columns, x2 and y2, that hold the values of x and y from the row with the same id and a time that is 2 greater (time + 2).

The result should look like this:

id	time	x	y	x2	y2
1	0	14	12	52	14
1	1	32	23
1	2	52	14
2	2	12	34
3	0	62	17	22	25
3	1	82	35
3	2	22	25

Please note that the dataframe is not sorted by id.

I have tried the following for x2, but it does not work as intended:

t=2
data['x2'] = data.apply(lambda x: x['x'] if (data[(data['id']==x['id']) & ((data['time']+t) == x['time'])].size > 0) else '', axis=1)

The following works, but I need a more concise approach with the best possible performance, because my data is huge:

t=2
for index, row in data.iterrows():
    rowT = data[(data['id'] == row['id']) & (data['time'] == (row['time'] + t))]
    if rowT.size > 0:
        data.loc[index, 'x2'] = rowT['x'].values[0]

Shubham Sharma · Accepted Answer

You can create a new dataframe in which every value in the time column is reduced by 2, then left merge the original dataframe with it on the columns id and time. Each row then picks up x and y from the row with the same id at time + 2, and rows without such a match get NaN:

df_r = df.assign(time=df['time'].sub(2))
df.merge(df_r, on=['id', 'time'], how='left', suffixes=['', '2'])

   id  time   x   y    x2    y2
0   1     0  14  12  52.0  14.0
1   1     1  32  23   NaN   NaN
2   1     2  52  14   NaN   NaN
3   2     2  12  34   NaN   NaN
4   3     0  62  17  22.0  25.0
5   3     1  82  35   NaN   NaN
6   3     2  22  25   NaN   NaN
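
For reference, the same idea can be sketched as a self-contained script. The helper name add_lookahead and its parameters are hypothetical, the sample rows are the ones from the question in shuffled order, and the sketch assumes each (id, time) pair occurs at most once; validate="m:1" makes the merge raise an error if that assumption does not hold. The final cast to the nullable Int64 dtype is optional and only there to avoid the float columns shown above.

```python
import pandas as pd

# Sample rows from the question, deliberately not sorted by id.
data = pd.DataFrame({
    "id":   [3, 1, 2, 1, 3, 1, 3],
    "time": [0, 0, 2, 1, 1, 2, 2],
    "x":    [62, 14, 12, 32, 82, 52, 22],
    "y":    [17, 12, 34, 23, 35, 14, 25],
})


def add_lookahead(df, t, suffix="2"):
    """Hypothetical helper: attach x and y from the row with the same id at time + t."""
    # Shift time back by t so the row at time T + t lines up with the row at time T.
    shifted = df.assign(time=df["time"] - t)
    # A left merge keeps every original row; unmatched rows get NaN.
    # validate="m:1" raises if (id, time) is not unique in the shifted frame.
    return df.merge(shifted, on=["id", "time"], how="left",
                    suffixes=("", suffix), validate="m:1")


result = add_lookahead(data, t=2)

# Optional: NaN forces float columns; the nullable Int64 dtype keeps whole numbers.
result[["x2", "y2"]] = result[["x2", "y2"]].astype("Int64")
print(result)
```

Because the lookup is a single merge rather than a per-row filter, it avoids the row-by-row scan of the iterrows version, which is usually what matters on large data.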
