Reputation: 551
I have a pandas df with 3 columns: col1, col2, col3. Values in col1 and col3 may be empty in some cases. Column col2 starts empty. The goal is to populate col2. I would like to iterate over the rows and compare the col3 value of the current row to the col1 value of the next row. col2 should become the smaller of the two dates (as you can see in the image).
How can I do this in pandas?
Upvotes: 0
Views: 157
Reputation: 59519
Use np.min with shift:
import pandas as pd
import numpy as np

df = pd.DataFrame({'col1': ['2013-12-19', '2014-12-16', '2015-02-06', '2016-01-22',
                            '2016-02-24', '2016-04-25', '2017-04-13'],
                   'col3': ['2014-06-28', '2015-10-07', '2015-07-19', '2016-02-11',
                            '2016-04-28', '2017-02-28', '2018-02-15']})
# Convert both string columns to datetime64
df = df.apply(pd.to_datetime)
# Element-wise minimum of the next row's col1 and the current row's col3
df['col2'] = np.min([df.col1.shift(-1).values, df.col3.values], axis=0)
df
Output:
        col1       col3       col2
0 2013-12-19 2014-06-28 2014-06-28
1 2014-12-16 2015-10-07 2015-02-06
2 2015-02-06 2015-07-19 2015-07-19
3 2016-01-22 2016-02-11 2016-02-11
4 2016-02-24 2016-04-28 2016-04-25
5 2016-04-25 2017-02-28 2017-02-28
6 2017-04-13 2018-02-15 2018-02-15
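Since the question mentions that col1 and col3 may contain empty values, here is a minimal sketch of a pandas-only variant of the same idea. It is not part of the original answer: it swaps np.min for pd.concat plus DataFrame.min(axis=1), which skips missing values (NaT) by default. Depending on the NumPy version, np.min on datetime arrays may instead propagate NaT, which would also affect the last row, where shift(-1) has no next value. The sample data below is made up for illustration.

```python
import pandas as pd

# Hypothetical sample with gaps in both columns
df = pd.DataFrame({
    'col1': ['2013-12-19', '2014-12-16', '2015-02-06', None],
    'col3': ['2014-06-28', '2015-10-07', None, '2016-02-11'],
})
df = df.apply(pd.to_datetime)

# Pair the next row's col1 with the current row's col3,
# then take the row-wise minimum; NaT is skipped by default (skipna=True)
candidates = pd.concat([df['col1'].shift(-1), df['col3']], axis=1)
df['col2'] = candidates.min(axis=1)
```

With this approach a row gets NaT in col2 only when both candidates are missing.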
Upvotes: 1
Reputation: 410
My apologies, I misread your question. I admit I only skimmed it, sorry!
This should work:
# Stop one row early because each step looks at the next row
for idx in range(len(df) - 1):
    df.loc[idx, 'col2'] = min(df.loc[idx, 'col3'], df.loc[idx + 1, 'col1'])
This will leave the last value in col2 as nan due to the offset iteration.
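One caveat with the loop: comparisons involving a missing timestamp (NaT) are always False, so the built-in min can return either argument depending on their order. Below is a minimal sketch of the same loop that ignores missing values. The helper name earliest and the sample data are hypothetical and not part of the original answer.

```python
import pandas as pd

# Hypothetical sample with one missing col3 value
df = pd.DataFrame({
    'col1': pd.to_datetime(['2013-12-19', '2014-12-16', '2015-02-06']),
    'col3': pd.to_datetime(['2014-06-28', None, '2015-07-19']),
})
df['col2'] = pd.NaT  # col2 starts empty

def earliest(a, b):
    """Hypothetical helper: smaller of two timestamps, ignoring missing ones."""
    present = [t for t in (a, b) if pd.notna(t)]
    return min(present) if present else pd.NaT

# Stop one row early because each step looks at the next row
for idx in range(len(df) - 1):
    df.loc[idx, 'col2'] = earliest(df.loc[idx, 'col3'], df.loc[idx + 1, 'col1'])
```

As in the original loop, the last row of col2 stays empty.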
Let me know if that helps!
Upvotes: 0