sac

Reputation: 215

Issue while comparing the decimal column values in pandas

I have the dataframe below with two columns; both have dtype object.

    TYP             T_TYP
0   181.23876781111 181.23876751111
1   273.98111182222 273.98111182222
2   123456575765776 889.53543543444
3   343.56TUUY87888 646546545454555
4   CGDYTFYFYHGC    455.YTTFGCFTTCT
5   0.0             123.5646546
6   local           68.46
7   TNT005          908

First, I use a regular expression to check that both columns contain purely numeric values in decimal format:

exp = r'^(\d+\.)+\d+$'
df['match'] = df['TYP'].str.match(exp) & df['T_TYP'].str.match(exp)
df

My resulting dataframe now looks like this:

    TYP             T_TYP           match
0   181.23876781111 181.23876751111 True
1   273.98111182222 273.98111182222 True
2   123456575765776 889.53543543444 False
3   343.56TUUY87888 646546545454555 False
4   CGDYTFYFYHGC    455.YTTFGCFTTCT False
5   0.0             123.5646546     True
6   local           68.46           False
7   TNT005          908             False

For each row where the match column is True, I need to compare the values in TYP and T_TYP. The whole-number part must match, and the fractional part must match up to the 6th decimal place. If the 7th decimal place does not match, the row should be shown as a mismatch. I tried numpy's where method, but it always raises TypeError: can't multiply sequence by non-int of type 'float'. I don't understand why this happens.

I would appreciate some help with this issue.

Upvotes: 2

Views: 869

Answers (1)

Quang Hoang

Reputation: 150735

I would do something like this:

# Non-numeric strings are coerced to NaN, and any comparison with NaN is False.
df['output'] = (
    pd.to_numeric(df['TYP'], errors='coerce')
    .sub(pd.to_numeric(df['T_TYP'], errors='coerce'))
    .abs() < 1e-6
)

Output:

               TYP            T_TYP  output
0  181.23876781111  181.23876751111    True
1  273.98111182222  273.98111182222    True
2  123456575765776  889.53543543444   False
3  343.56TUUY87888  646546545454555   False
4     CGDYTFYFYHGC  455.YTTFGCFTTCT   False
5              0.0      123.5646546   False
6            local            68.46   False
7           TNT005              908   False
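
As for the TypeError in the question: a likely cause is that both columns have dtype object and hold Python strings, so multiplying them by a float fails in the same way `"181.23" * 1e6` does. Converting with pd.to_numeric first, as above, avoids that. Note also that a tolerance of 1e-6 is not quite the same as "the first six decimal digits are identical" (for example, 1.0000009 and 1.0000011 differ by less than 1e-6 but disagree in the sixth digit). If a digit-by-digit comparison is what you want, here is a minimal sketch that compares the strings directly. The helper `same_to_places`, the column name `same_6dp`, and the stricter pattern `^\d+\.\d+$` (exactly one decimal point) are my own assumptions, not part of the original code.

```python
import pandas as pd

df = pd.DataFrame({
    "TYP": ["181.23876781111", "273.98111182222", "123456575765776",
            "343.56TUUY87888", "CGDYTFYFYHGC", "0.0", "local", "TNT005"],
    "T_TYP": ["181.23876751111", "273.98111182222", "889.53543543444",
              "646546545454555", "455.YTTFGCFTTCT", "123.5646546", "68.46", "908"],
})

# A str multiplied by a float raises the error from the question.
try:
    "181.23876781111" * 1e6
except TypeError as err:
    print(err)

# Assumed pattern: digits, exactly one dot, digits.
exp = r"^\d+\.\d+$"
df["match"] = df["TYP"].str.match(exp) & df["T_TYP"].str.match(exp)


def same_to_places(a: str, b: str, places: int = 6) -> bool:
    """Hypothetical helper: True if the whole parts are equal and the
    first `places` fractional digits are identical."""
    a_whole, a_frac = a.split(".")
    b_whole, b_frac = b.split(".")
    # Pad short fractions with zeros so "0.5" and "0.500000" compare equal.
    a_digits = a_frac[:places].ljust(places, "0")
    b_digits = b_frac[:places].ljust(places, "0")
    return int(a_whole) == int(b_whole) and a_digits == b_digits


# Only rows that passed the regex are compared; all others are False.
df["same_6dp"] = [
    bool(ok) and same_to_places(a, b)
    for a, b, ok in zip(df["TYP"], df["T_TYP"], df["match"])
]
print(df)
```

If the 7th decimal place should also have to agree, call the helper with places=7; row 0 of the sample data would then become a mismatch.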

Upvotes: 2
