Reputation: 215
I have the dataframe below with two columns; both columns have dtype object.
               TYP            T_TYP
0  181.23876781111  181.23876751111
1  273.98111182222  273.98111182222
2  123456575765776  889.53543543444
3  343.56TUUY87888  646546545454555
4     CGDYTFYFYHGC  455.YTTFGCFTTCT
5              0.0      123.5646546
6            local            68.46
7           TNT005              908
First, I use a regular expression to check that both columns hold all-numeric data in decimal format:
exp = r'^(\d+\.)+\d+$'
df['match'] = df['TYP'].str.match(exp) & df['T_TYP'].str.match(exp)
df
The resulting dataframe now looks like this:
               TYP            T_TYP  match
0  181.23876781111  181.23876751111   True
1  273.98111182222  273.98111182222   True
2  123456575765776  889.53543543444  False
3  343.56TUUY87888  646546545454555  False
4     CGDYTFYFYHGC  455.YTTFGCFTTCT  False
5              0.0      123.5646546   True
6            local            68.46  False
7           TNT005              908  False
For every row where the match column is True, I need to compare the values in TYP and T_TYP. The whole part of the value should match, and the fraction part should match up to the 6th decimal place. If the 7th decimal place does not match then show it as a mismatch. I tried the numpy where method, but it always fails with TypeError: can't multiply sequence by non-int of type 'float'. I do not understand why this is happening.
Any help with this issue would be appreciated.
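The np.where attempt itself is not shown, so the following is only a guess at the cause: the error message is the one Python raises when a str is multiplied by a float, and cells in an object-dtype column of text are plain Python strings. A minimal sketch on a two-row sample of the data (the names value, error_message and numeric are illustrative):

```python
import pandas as pd

# Two rows from the question; both columns hold strings (dtype object).
df = pd.DataFrame({
    "TYP": ["181.23876781111", "local"],
    "T_TYP": ["181.23876751111", "68.46"],
})

# A single cell is a Python str, so float arithmetic on it fails
# the same way it would on any other string.
value = df["TYP"].iloc[0]
try:
    value * 1e6
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Converting first avoids the problem; non-numeric text becomes NaN.
numeric = pd.to_numeric(df["TYP"], errors="coerce")
print(numeric)
```

If this is indeed the cause, converting the columns with pd.to_numeric before doing any arithmetic should make the error go away.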
Upvotes: 2
Views: 869
Reputation: 150735
I would do something like this:
df['output'] = (pd.to_numeric(df['TYP'], errors='coerce')
                  .sub(pd.to_numeric(df['T_TYP'], errors='coerce'))
                  .abs() < 1e-6
               )
Output:
               TYP            T_TYP  output
0  181.23876781111  181.23876751111    True
1  273.98111182222  273.98111182222    True
2  123456575765776  889.53543543444   False
3  343.56TUUY87888  646546545454555   False
4     CGDYTFYFYHGC  455.YTTFGCFTTCT   False
5              0.0      123.5646546   False
6            local            68.46   False
7           TNT005              908   False
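Note that an absolute difference below 1e-6 is not quite the same as "the first six decimals are identical": two values can sit on either side of a sixth-decimal boundary and still be closer than 1e-6. If the digit-by-digit reading is the intended one, a possible sketch is to truncate the text itself and compare. The helper truncate6, the stricter single-point pattern, and the extra last row are assumptions added for illustration, not part of the original data or answer:

```python
import pandas as pd

df = pd.DataFrame({
    "TYP": ["181.23876781111", "273.98111182222", "123456575765776",
            "0.0", "local", "1.0000009"],
    "T_TYP": ["181.23876751111", "273.98111182222", "889.53543543444",
              "123.5646546", "68.46", "1.0000011"],
})

# Assumed pattern: digits, exactly one decimal point, digits.
pattern = r"^\d+\.\d+$"
both_decimal = df["TYP"].str.match(pattern) & df["T_TYP"].str.match(pattern)

def truncate6(s):
    """Keep the whole part and the first six decimals, as text."""
    parts = s.str.split(".")
    whole = parts.str[0]
    # Values without a decimal point get NaN here and never compare equal.
    frac = parts.str[1].str.slice(0, 6).str.ljust(6, "0")
    return whole + "." + frac

# Digit-wise comparison, restricted to rows where both values are decimals.
df["trunc_match"] = both_decimal & (truncate6(df["TYP"]) == truncate6(df["T_TYP"]))

# Tolerance-based comparison from the answer, for contrast.
df["tolerance"] = (pd.to_numeric(df["TYP"], errors="coerce")
                     .sub(pd.to_numeric(df["T_TYP"], errors="coerce"))
                     .abs() < 1e-6)
print(df)
```

On the added last row the two approaches disagree: the values differ by about 2e-7, so the tolerance check passes, while the sixth decimals (0 versus 1) differ, so the truncated comparison fails. Which behaviour is wanted depends on how "match up to the 6th decimal place" is meant.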
Upvotes: 2