How to merge two bins in a pandas data frame?

Question

I am binning data with pd.cut. After this step, I compute the mean of val in each bin, and if the difference between the means of two adjacent bins is below a threshold, I want to merge those two bins.


import pandas as pd

df = pd.DataFrame([{'col1': 7, 'val': 2},
                   {'col1': 20, 'val': 22},
                   {'col1': 11, 'val': 12},
                   {'col1': 9, 'val': 13},
                   {'col1': 14, 'val': 11}])

df['bin1'] = pd.cut(df['col1'], 3)

df2 = pd.DataFrame(df.groupby('bin1')['val'].mean())

threshold = 5

Output:


                   val
bin1    
(6.987, 11.333]     9
(11.333, 15.667]    11
(15.667, 20.0]      22

If the difference between the mean val of two bins is less than the threshold (5), I want to merge the bins.

So the new bins now should be:

                 
bin1    
(6.987, 15.667]     
(15.667, 20.0]

I don't know how to do the last step. Thank you!
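
One possible way to do the last step is sketched below. It is not a definitive solution and rests on a few assumptions: only neighbouring bins are compared, each bin is compared with the bin immediately to its left (so a chain of small differences can merge several bins in a row), and the names edges, means, keep, new_edges and bin2 are made up for illustration. The idea is to ask pd.cut for the bin edges with retbins=True, drop every inner edge whose two neighbouring bins have means closer than the threshold, and cut again with the remaining edges.

```python
import pandas as pd

df = pd.DataFrame({'col1': [7, 20, 11, 9, 14],
                   'val': [2, 22, 12, 13, 11]})
threshold = 5

# retbins=True makes pd.cut also return the bin edges as a NumPy array
df['bin1'], edges = pd.cut(df['col1'], 3, retbins=True)

# Mean of val per bin, in bin order; observed=False keeps empty bins (mean is NaN)
means = df.groupby('bin1', observed=False)['val'].mean()

# Absolute difference between each bin's mean and the mean of the bin to its left.
# The first entry has no left neighbour, so it is skipped.
diffs = means.diff().abs().to_numpy()[1:]

# Keep an inner edge only if the bins on either side differ by at least the
# threshold. NaN differences (from empty bins) compare as False, so the edges
# on both sides of an empty bin are dropped.
keep = diffs >= threshold
new_edges = [edges[0], *edges[1:-1][keep], edges[-1]]

# Re-bin with the reduced set of edges (hypothetical column name bin2)
df['bin2'] = pd.cut(df['col1'], bins=new_edges)
print(df.groupby('bin2', observed=False)['val'].mean())
```

For the sample data the means are 9, 11 and 22, so only the edge between the first two bins is dropped and bin2 ends up with the two bins (6.987, 15.667] and (15.667, 20.0]. If the comparison should instead use the mean of the already merged group rather than the immediate neighbour, the edge selection would need a loop that recomputes the group mean after each merge.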
