Reputation: 127
I want to find which column's date is closest to the date in cv_date_x column and return True there. I tried the below code but it returned False to all columns. Is there any solutions?? Thank you so much.
pre_df = pre_df.apply(lambda x: pd.to_datetime(x, errors='coerce'))
# set datetime to all values
delete_col = ['cv_date_x','cv_date_y','userid']
pre_cols = pre_df.columns.drop(delete_col)# remove unnecessary cols
pre_df1 = pre_df.apply(lambda x: x == np.min(abs(x[pre_cols] - x['cv_date_x'])), axis=1)
print(pre_df1)
The output of "print(pre_df1)"
cv_date_x,col1,col2,col3
False,False,False,False
pre_df
cv_date_x,col1,col2,col3
2021-07-01,2021-05-01,2021-07-02,2021-07-30
⇩
ideal output
cv_date_x,col1,col2,col3
False,False,True,False
because 2021-07-02 is closest to 2021-07-01
Upvotes: 1
Views: 74
Reputation: 71689
print(df)
cv_date_x col1 col2 col3
0 2021-07-01 2021-05-01 2021-07-02 2021-07-30
1 2021-07-28 2021-05-01 2021-07-02 2021-07-30
Numpy
a = df.to_numpy()
i = np.abs(a[:, 1:] - a[:, 0, None]).argmin(1) + 1
df_out = pd.DataFrame(i[:, None] == range(df.shape[1]), columns=df.columns)
pandas
df_out = (
df.drop('cv_date_x', axis=1)
.sub(df['cv_date_x'], axis=0)
.abs().idxmin(1).str.get_dummies()
.reindex(df.columns, axis=1, fill_value=0)
.astype('bool')
)
print(df_out)
cv_date_x col1 col2 col3
0 False False True False
1 False False False True
Upvotes: 2