Compare float values in one column with all other columns in a pandas DataFrame

Question

I have a dataframe with multiple columns carrying float values.

import pandas as pd

df = pd.DataFrame({
    "v0": [0.493864, 0.378362, 0.342887, 0.308959, 0.746347],
    "v1": [0.018915, 0.018535, 0.019587, 0.035702, 0.008325],
    "v2": [0.252000, 0.066746, 0.092421, 0.036694, 0.036506],
    "v3": [0.091409, 0.103887, 0.098669, 0.112207, 0.043911],
    "v4": [0.058429, 0.312115, 0.342887, 0.305678, 0.103065],
    "v5": [0.493864, 0.378362, 0.338524, 0.304545, 0.746347]})

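I need to create another column, Result, in df by comparing the value in each row of df['v0'] with the values in the same row of the subsequent columns v1-v5. Result should be 1 if any of those columns matches v0 in that row, and 0 otherwise.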

What I need is as below:

         v0        v1        v2        v3        v4        v5  Result
0  0.493864  0.018915  0.252000  0.091409  0.058429  0.493864       1
1  0.378362  0.018535  0.066746  0.103887  0.312115  0.378362       1
2  0.342887  0.019587  0.092421  0.098669  0.342887  0.338524       1
3  0.308959  0.035702  0.036694  0.112207  0.305678  0.304545       0
4  0.746347  0.008325  0.036506  0.043911  0.103065  0.746347       1

I have tried many approaches including This link and This link

But it seems the task that I require is not doable. I have been struggling with this for the last couple of days. The original dataset I have has more than 60000 rows. Please suggest the best and fastest way.

cs95 · Accepted Answer

A better solution for dealing with floating point comparisons is to use np.isclose with broadcasting:

import numpy as np

v = df.to_numpy()  # 2-D array of the float columns, with v0 in column 0
df['Result'] = np.isclose(v[:, 1:], v[:, [0]]).any(axis=1).astype(int)
df
         v0        v1        v2        v3        v4        v5  Result
0  0.493864  0.018915  0.252000  0.091409  0.058429  0.493864       1
1  0.378362  0.018535  0.066746  0.103887  0.312115  0.378362       1
2  0.342887  0.019587  0.092421  0.098669  0.342887  0.338524       1
3  0.308959  0.035702  0.036694  0.112207  0.305678  0.304545       0
4  0.746347  0.008325  0.036506  0.043911  0.103065  0.746347       1
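
A self-contained sketch of the same idea may help when the reference column or the tolerances need to change. The helper name flag_matches and its parameters are hypothetical and not part of pandas or NumPy. The rtol and atol values shown are np.isclose's documented defaults, and how tight they should be for real data is an assumption to check.

```python
import numpy as np
import pandas as pd


def flag_matches(df, ref="v0", rtol=1e-05, atol=1e-08):
    """Return 1 per row where any other column is close to df[ref], else 0.

    Hypothetical helper for illustration only.
    """
    ref_vals = df[[ref]].to_numpy()            # shape (n, 1), kept 2-D for broadcasting
    others = df.drop(columns=ref).to_numpy()   # shape (n, k)
    # (n, k) compared against (n, 1) broadcasts the reference across the columns
    close = np.isclose(others, ref_vals, rtol=rtol, atol=atol)
    return close.any(axis=1).astype(int)


df = pd.DataFrame({
    "v0": [0.493864, 0.378362, 0.342887, 0.308959, 0.746347],
    "v1": [0.018915, 0.018535, 0.019587, 0.035702, 0.008325],
    "v2": [0.252000, 0.066746, 0.092421, 0.036694, 0.036506],
    "v3": [0.091409, 0.103887, 0.098669, 0.112207, 0.043911],
    "v4": [0.058429, 0.312115, 0.342887, 0.305678, 0.103065],
    "v5": [0.493864, 0.378362, 0.338524, 0.304545, 0.746347]})

df["Result"] = flag_matches(df)
```

Because the whole comparison is a single vectorized NumPy operation, there is no Python-level loop over rows, which matters for a frame with tens of thousands of rows.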

Do NOT use equality-based comparisons when dealing with floats, because of the possibility of floating point inaccuracies. See the question "Is floating point math broken?"
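
A minimal illustration of the problem, using the classic 0.1 + 0.2 case rather than values from the question's data:

```python
import numpy as np

a = 0.1 + 0.2   # stored as 0.30000000000000004 in binary floating point
b = 0.3

exact_equal = (a == b)             # False: the two doubles differ in the last bits
close_enough = np.isclose(a, b)    # True: the difference is within the default tolerance

print(exact_equal, bool(close_enough))
```

Values that come out of arithmetic, or that are parsed from differently rounded text, can differ in the same way even when they print identically.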
