Reputation: 995
I'm currently trying to generate a quartile map of a few key figures.
My quartiles are in a Pandas DataFrame looking like this:
0,05 0,1 0,25 0,33
IndicatorName
indicator 1 10653 10512 10096 9857
indicator 2 2,85 2,87 3,01 3,11
indicator 3 1,66 1,75 1,84 1,9
indicator 4 13,01 11,78 8,55 7,64
This is a quartile mapping from a few hundred users I then query my sql database and get values for One user and load this into a DF
value
IndicatorName
indicator1 9917.00
indicator2 3.10
indicator3 1.86
indicator4 13.74
What I want to do now is to create a new column in my second DF containing an indication of which quartile the value lies within (nearest matching value):
value quartile
IndicatorName
indicator1 9917.00 0,33
indicator2 3.10 0,33
indicator3 1.86 0,25
indicator4 13.74 0,05
How would you go about comparing dataframes like this?
Upvotes: 2
Views: 1822
Reputation: 863301
Zero step is replace ,
to .
in df1
and cast to float
s.
df1 = df1.replace(',','.', regex=True).astype(float)
Or:
df1 = pd.read_csv(file, decimal=',')
Also is necessary match indices, so if only difference is whitespaces, remove it:
df1.index = df1.index.str.replace('\s+','')
Then subtract column value
by sub
, get abs
values and find column of minimal value by DataFrame.idxmin
:
df2['quartile'] = df1.sub(df2['value'],axis=0).abs().idxmin(axis=1)
print (df2)
value quartile
IndicatorName
indicator1 9917.00 0,33
indicator2 3.10 0,33
indicator3 1.86 0,25
indicator4 13.74 0,05
Details:
print (df1.sub(df2['value'],axis=0))
0,05 0,1 0,25 0,33
IndicatorName
indicator1 736.00 595.00 179.00 -60.00
indicator2 -0.25 -0.23 -0.09 0.01
indicator3 -0.20 -0.11 -0.02 0.04
indicator4 -0.73 -1.96 -5.19 -6.10
print (df1.sub(df2['value'],axis=0).abs())
0,05 0,1 0,25 0,33
IndicatorName
indicator1 736.00 595.00 179.00 60.00
indicator2 0.25 0.23 0.09 0.01
indicator3 0.20 0.11 0.02 0.04
indicator4 0.73 1.96 5.19 6.10
Upvotes: 1